Preface


IN THE SUMMER of 1958 The Johns Hopkins University was 
presented with a unique opportunity to experiment in the field of 
mathematics teaching at the secondary-school level. This opportunity
was given to us by the Esso Education Foundation. Aside
from expressing an interest in the improvement of science and 
mathematics teaching, the Foundation granted the University a 
free hand to do what it considered to be the more significant 
investigation. 

After a series of consultations involving various members of the 
science faculties and the administration, a decision was reached to 
investigate the field intermediate between pure mathematics and 
the natural sciences. This field was that of applied mathematics or 
scientific mathematics. 

The actual program took the form of a special in-service training 
program for a selected group of secondary-school teachers. During 
the two academic years 1958-1959 and 1959-1960, one-hundred 
secondary-school mathematics and science teachers from the city 
of Baltimore and Baltimore County participated in the course. 
These participants have had a marked influence on this book as it 
now appears in final form. 

Together with Dr. Joseph Sampson of the Department of Mathematics,
the author took the major responsibility for the course to be
presented. We were aided from the beginning by the advice and
support of two committees: one from The Johns Hopkins University,
and one composed of the science and mathematics supervisors
from Baltimore and Baltimore County public and private schools.
the Esso Research and Engineering Company; Mr. William P.
Headden of the Public Relations Department, Standard Oil Company
of New Jersey; and Mr. George M. Buckingham, present
Notation

Vectors which are designated by a symbol such as
K in the illustrations will be represented by
bold face letters such as A in the text.

Matrices will be represented by open face letters 
such as S in the text. The elements of a matrix 
are shown as symbols having two subscripts. 
Punctuation at the end of equations has been 
omitted throughout in the interest of clarity. 
CHAPTER 1


Geometry 
and Matrices 

A. Description of E 3

EVERYONE IS ENDOWED to some extent, at least, with an 
understanding of the physical space that surrounds us and the events 
of nature that take place in that space. Such events as the motion 
of material bodies, propagation of sound waves, and chemical and 
biological changes are in many cases perceptible. It is the business 
of the scientist to describe events of this type as accurately as he can. 

A description of natural events can be phrased in ordinary language,
of course, and before Newton’s time nearly all scientific observations
were cast in that form. But verbal language has been found to be
extremely unwieldy and unsuited to accurate descriptions of natural
phenomena.

The scientist, therefore, must use the language of mathematics.
In order to use this flexible and more precise language he must represent
the phenomena of nature by mathematical objects, which
correspond to observations in some manner. In the pages to follow
we shall give an account of how this is done as well as an account
of some very important mathematical concepts.

A given set of mathematical objects is sometimes like the parts
of a game. Symbols, pictorial or abstract, are supplied to the contestant.
He is then provided with a strict set of rules for manipulating
the symbols. The important aspect of the game is that
once the rules have been set they can never be violated in any
extension of the game.


Very often it is difficult to decide just how to represent a situation 
in nature by a mathematical object. The choice of the mathematical 
object is never unique, and the usefulness of a given choice depends 
largely upon the skill and insight of the scientist. However, a large body 
of experience has established the usefulness and accuracy of certain standard 
mathematical objects as sufficient representatives for a variety of real things. 

Among these useful objects is the three-dimensional Euclidean 
space (we shall call it simply E 3 ). E 3 is utilized as a mathematical 
representative of a real local space. 

We use the word local because even at first thought we should 
not expect to extrapolate the properties of the space observed in 
our immediate vicinity to the entire space of the universe. More will 
be said about the concept of a local space later in this book. 

If we take the plane of this page to represent a two-dimensional space
(called E 2 ), then E 3 is simply a three-dimensional
version of E 2 . In other words, besides including all points in the plane
of this page we include all points in parallel planes above and below.

The student will recall how E 2 is described by means of certain 
give presently.

The axioms which we shall use are in the form best suited to our purposes
and are somewhat different in appearance from the familiar
axioms of plane geometry.


E 3 is first of all a collection of (undefined) objects called points.
These points are imagined as corresponding to the points of the real
local space. The axioms of E 3 assign to every pair of points P, Q a
number called the distance between P and Q. This distance is supposed
to represent the physical distance between the real points
corresponding to P and Q (measured in appropriate units).

Certain collections of points in E 3 will be called straight lines.
These will be defined later. They are used to represent the straight 
lines of the physical space. 

We point out here that we can define the straight lines in a real local 
space as the paths of light rays. This is to say the straight line joining 
two real points is by definition the line of sight from one point to the 
other. 

It is of some interest to note that the straight lines of E 3 for most
problems will correspond to the light rays of the physical space to
a high degree of accuracy. For instance, if we represent the refraction
of light in an optical lens by lines in E 3 we obtain a very accurate
correspondence between the diagram in E 3 and the physical
results in our local space.

On the other hand there is reason to believe that the paths of 
light rays will be affected by the presence of material bodies. The 
propositions of General Relativity suggest that for astronomical 
distances the straight lines of E 3 will not correspond to light rays. 



In spite of the limitations of E 3 when applied to problems which 
deal with astronomical distances, we can use E 3 as a mathematical 
model of our local space to a high degree of accuracy. 
In the nineteenth century, C. F. Gauss proposed an experiment
to test the validity of E 3 . 

Consider a physical triangle whose vertices A, B, and C lie on three 
mountain peaks. The sides of the triangle are then the lines of sight 
joining the three points in question. Suppose now that we measure 
the three angles of our triangle as accurately as possible. 

There is reason to suppose that the three angles will not
sum exactly to 180°. The departure of the sum from 180° would
provide us with a measure of the validity of E 3 for such distances as
are involved.

Interestingly enough, such measurements add to 180° within the 
experimental error of present day measurements. Thus E 3 is adequate 
for most local problems. 

Once a departure from E 3 is detected, a new or modified mathematical
model must be substituted whenever great accuracy is required.
Such a model must behave as E 3 in those approximations
in which E 3 is known to provide an adequate description.


B. The Axioms of E 3

A precise description of E 3 can be presented by stating the axioms of E 3 .
As has been stated previously, E 3 is composed of a set of abstract objects
called points.

The axioms of E 3 provide a relationship between these points.

Many equivalent axioms can be given. For our purposes the following
two axioms are the simplest:



Axioms of E 3 

1. For any two points P, Q there is assigned a number d, where d ≥ 0.
This number is called the distance between P and Q.

2. It is possible to assign to every point of E 3 three numerical
coordinates x, y, and z such that:

(a) For any triplet of numbers (x, y, z) there is one and only one
point of E 3 having (x, y, z) as its coordinates.

(b) If P has the coordinates (x, y, z) and if Q has the coordinates
(x', y', z') then d, the distance between P and Q, is equal to

$$[(x - x')^2 + (y - y')^2 + (z - z')^2]^{1/2}$$


These two axioms contain all that we shall need to know about
E 3 . The points in E 3 are thought of as corresponding to the
points of physical space in such a way that if P and Q in E 3 correspond
to $P_0$ and $Q_0$ in real space, then d, the distance between
P and Q, is equal numerically to the distance between $P_0$ and $Q_0$.



Before discussing the coordinate system we should note that the 
definition of the distance between two points (x, y, z) and (x', y', z') 
implies the Pythagorean theorem. To be more specific our definition 
of d requires that the line (x — x'), the line (y — y'), and the 
line (z — z') all be mutually perpendicular. The proof of this can 
be seen readily from the diagram below. 

In the section to follow we shall find that the rule

$$d^2 = (x - x')^2 + (y - y')^2 + (z - z')^2$$

will require a set of coordinate axes which are mutually perpendicular.

As an example of the definition of the distance between two points in
E 2 let us consider the following problem.

Point P is located by the coordinates (3, 4). This is to say x = 3
and y = 4.

Assume in this case point Q is given by (-5, 1), i.e. x' = -5
and y' = 1.

The distance between P and Q is then

$$d = [(3 - (-5))^2 + (4 - 1)^2]^{1/2} = [73]^{1/2}$$
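The arithmetic of this example can be checked with a short program. The following is a minimal sketch in Python; the function name `distance` is chosen here for illustration and does not appear in the text. It applies the rule of axiom 2 (b) to any two points given as tuples of coordinates.

```python
import math

def distance(p, q):
    """Distance between points p and q, each a tuple of coordinates,
    computed by the rule of axiom 2 (b)."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# The example of the text: P = (3, 4), Q = (-5, 1) in E 2.
d = distance((3, 4), (-5, 1))
print(d * d)  # close to 73, up to floating-point rounding
```

The same function serves for E 3 , since the number of coordinates is taken from the points themselves.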


C. The Coordinate System

The distance “d” in E 3 has been defined without reference to a
specific coordinate system.

As we have mentioned in the last section, our definition of the distance
invokes the Pythagorean theorem. This definition implies that
each number of the triplet specifying a point (such as the number
x in (x, y, z)) corresponds to a measurement along one of
three mutually perpendicular axes (see footnote 1).

Utilizing our spatial intuition we can specify that the perpendicular
(or orthogonal) coordinates are formed by the intersection of
three mutually perpendicular planes. As an example the walls
forming the corner of a room form a set of orthogonal (or perpendicular)
coordinates.

1 By perpendicular we mean that the angle between any pair of axes is 90°
(measured in the plane formed by the two axes).


The coordinate x is found by passing a plane through P parallel to the yz
plane. The intersection between this plane and the x axis specifies the
x coordinate of the point P.

In like manner a plane through P parallel to the zx plane provides the
y coordinate of P, and a plane through P parallel to the xy plane
provides the z coordinate.

We then see that the original planes 
defining the coordinate axes plus the 
three planes defining the coordinates 
(x, y, z) of the point P form a rectangular 
parallelepiped. 

The coordinate axes are defined as those
collections of points whose
coordinates are

(x, 0, 0): the x axis
(0, y, 0): the y axis
(0, 0, z): the z axis


This particular system is called Cartesian (see footnote 2).


By passing from the points of E 3 to their coordinates we are able to
translate geometrical problems into numerical problems. Geometry
referred to a specific coordinate system is called analytic (or
coordinate) geometry.

It should be emphasized that the coordinate system is not unique; it only
provides a frame for reference.

2 In some texts the term Cartesian has a slightly 
broader meaning than we have given here. 
P can be specified relative to the frame O or relative to a
frame O'.

In the preceding diagram a second set of axes is shown, designated
by the origin O'. We see that the point P could have been
described by the triplet of numbers (X, Y, Z).

Throughout this discussion we have emphasized the condition
that our coordinate axes form a mutually perpendicular set.
A set of non-orthogonal axes could have been used. It is possible
to describe the point P in terms of any non-coplanar set of
axes (see footnote 3). However, in doing so we should violate axiom 2 (b), which
defines the distance between two points.

To illustrate this let us consider a simple two-dimensional set of
axes. A coordinate system satisfying 2 (b) (the definition of d) is
called orthogonal or Euclidean. A two-dimensional Euclidean system
can be called E 2 , where

$$d^2 = (x - x')^2 + (y - y')^2$$



Consider the non-orthogonal two-dimensional system shown.

This system is formed from two straight lines which
intersect at the angle $\alpha$ ($\alpha$ in this case is less
than 90°).

This is an example in which the distance between P and Q is not
given by the square root of the sum of the squares of the coordinate
differences.

3 By non-coplanar axes we signify that the three axes cannot lie in the same plane.
This statement also provides the condition that no two axes are parallel (remembering
that the axes meet or intersect at one point called the origin).
We must define the coordinates in the same manner as in the
case of the orthogonal system, noting simultaneously that a plane
through P in E 3 becomes a line through P in E 2 .

Passing lines parallel to the “b” axis through P and Q, we find the intersections
with the “a” axis which provide the coordinates $a_p$ and $a_q$
of P and Q, respectively (see footnote 4).

Using lines parallel to “a” through P and Q we obtain $b_p$ and $b_q$.

The distance d between P and Q is now defined in terms of the
two sides of the triangle $(a_q - a_p)$, $(b_q - b_p)$, and the angle $(\pi - \alpha)$.

$$d^2 = (a_q - a_p)^2 + (b_q - b_p)^2 - 2 (a_q - a_p)(b_q - b_p) \cos(\pi - \alpha)$$

It is apparent that this is a general relation since the distance “d”
for E 2 is obtained in the special case in which the angle $\alpha$ is equal
to $\pi/2$.
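The general relation can be tried numerically. The sketch below, in Python, is an illustration only: the function names are hypothetical, and the cross-check assumes that the “a” axis is laid along an ordinary x axis so that a point with oblique coordinates (a, b) has Cartesian coordinates x = a + b cos α and y = b sin α. That placement is an assumption made here, not a statement of the text.

```python
import math

def oblique_distance(p, q, alpha):
    """Distance between p = (a_p, b_p) and q = (a_q, b_q), given in axes
    that intersect at the angle alpha (radians), by the law of cosines."""
    da = q[0] - p[0]
    db = q[1] - p[1]
    d_squared = da**2 + db**2 - 2 * da * db * math.cos(math.pi - alpha)
    # Guard against a tiny negative value caused by rounding.
    return math.sqrt(max(d_squared, 0.0))

def oblique_to_cartesian(point, alpha):
    """Assumed placement: the 'a' axis lies along x, the 'b' axis is tilted by alpha."""
    a, b = point
    return (a + b * math.cos(alpha), b * math.sin(alpha))

alpha = math.pi / 3
p, q = (0.0, 0.0), (1.0, 1.0)
px, py = oblique_to_cartesian(p, alpha)
qx, qy = oblique_to_cartesian(q, alpha)
print(oblique_distance(p, q, alpha), math.hypot(qx - px, qy - py))
```

With alpha set to π/2 the cosine term vanishes and the function returns the ordinary distance of E 2 .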
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^ Counter- Clockwise 



X 



Right-handed and Left-handed Cartesian Systems 

Another property of a coordinate system which must be noted is
the order in which the coordinates are taken. This order determines
whether a physical coordinate system is “right handed” or
“left handed.”

To illustrate in an elementary manner the difference between
these two systems consider the standard E 2 shown at the right above.
At the left is pictured a left handed two-dimensional space.

In E 2 the positive x axis must be rotated counter-clockwise to
obtain the positive y axis; under these conditions the system is called
a right-handed system.

4 The subscript p on “a” merely denotes that $a_p$ belongs to P.

On the other hand the left-handed system has the property that 
the positive x' axis must be rotated clockwise in order to obtain the 
positive y axis. 

It is the convention in problems of mathematical science to choose 
a right-handed coordinate system. 

This concept of the “handedness” of a set of coordinates is
physiological, yet we find in physics that results of certain experiments
(now labeled the parity experiments) are not independent of
the “handedness” of our choice of laboratory coordinates.

Before discussing E 3 it should be noted that the “handedness” of
our system could have been denoted by the doublet of numbers
indicating a point P.

(x, y) corresponds to a right-handed system

(y, x) or (-x, y) corresponds to a left-handed system

Later in the section concerning rotations the difference in the 
rotation properties of the two systems will give us other differences 
between the two systems. 

In discussing the two-dimensional case the words clockwise and 
counter-clockwise were used. It becomes apparent then that the 
differentiation has a physiological aspect. We shall define a right 
handed system strictly in terms of the human body. 

Consider a physical coordinate system whose axes are labeled 
x, y, and z. 

This system is right handed if a man whose body lies along the z axis
with his right arm pointing along the x axis obtains the y axis by rotating
his right arm across the front of his body toward his left arm.

We can also describe this procedure by the right hand. Let the
thumb of the right hand be along the axis corresponding to the z
axis (the coordinate listed last in the triplet of numbers specifying a
point, i.e., (x, y, z)). Initially the fingers of the open hand point
along the x axis (the coordinate listed first in the triplet of numbers).
The y axis is obtained by rotating the fingers toward the


a left handed system. By inversion of an axis we mean that the axis 
of the inverted system corresponds to the negative axis of the right 
handed system. 

In the triplet of coordinates the order of writing the coordinates
can indicate the handedness. The order can be changed in a cyclic
manner without changing the handedness. Our meaning of cyclic order
can be shown by listing the “right handed” orders and the “left
handed” orders.

left handed        right handed

(x, z, y)          (x, y, z)
(z, y, x)          (z, x, y)
(y, x, z)          (y, z, x)


Although we shall not specifically deal with the order it is perhaps
interesting to note this property here.
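The division of the six orders into two classes can be checked mechanically. The sketch below, in Python, uses the scalar triple product a · (b × c) of the three axis directions; this test is brought in here for illustration and is not the method of the text. It assumes that the order (x, y, z) is the right handed one, and all names are chosen for this example.

```python
from itertools import permutations

# Unit vectors along the axes of a system assumed to be right handed.
AXES = {"x": (1, 0, 0), "y": (0, 1, 0), "z": (0, 0, 1)}

def triple_product(a, b, c):
    """Scalar triple product a . (b x c)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def handedness(order):
    """Classify an ordering of the axis names, e.g. 'zxy'."""
    a, b, c = (AXES[name] for name in order)
    return "right handed" if triple_product(a, b, c) > 0 else "left handed"

for order in permutations("xyz"):
    print("(" + ", ".join(order) + ")", handedness(order))
```

The three cyclic orders give a positive product, and each interchange of two coordinates reverses the sign.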

In addition we perceive that there are only two orders possible. 
Below is shown a comparison of a “right handed” and a “left 
handed” system obtained by the inversion of the y axis. 



For many years scientists believed that the choice of coordinates
could be taken arbitrarily. That is to say that the results of any physical
experiment should not depend upon the left hand or right hand
properties of the coordinates. Because of this arbitrariness the laws
of physics were always set up to be independent of this particular
property of the coordinates.

This property of a physical system is referred to as parity. 

In an attempt to describe this problem in terms of everyday
observations, assume that the scientists on earth are able to make
radio contact with a similar type of being somewhere in outer space,
and further assume that the communications can be verbally
understood.

Now hypothetically we find that this “being” in outer space has
essentially the same make up as the “being” on earth, i.e., two
arms, two hands, a head, two legs, a heart, etc. After much conversation
we decide to correlate an experiment on earth and an experiment
in outer space.

The experiment is that of setting up a magnetic field by winding 
a current coil. The instructions of the earth people for finding the 
direction of the magnetic field are as follows: 

“Take one current loop. Curl the fingers of the right hand in the 
direction of the current, and then the thumb of that hand points in 
the direction of the field.” 





These instructions are conveyed to outer space, and at this stage we
receive a question from our correspondent. This question provides a
stumbling block. He asks,

"Which of my two hands should I call the right hand?"


Our first impulse is to say that his right hand is on the side opposite his
heart. Let us suppose for the sake of argument that his reply is to
the effect that his heart is in the center of his body midway between
the shoulders. Therefore we cannot find any way to uniquely choose
his right hand.

For this reason (keeping our special example of current loop and
magnetic field) for many years scientists assumed that the results of
experiments in magnetic fields would not uniquely determine the
direction of the field. More explicitly, let us say that if a radioactive
source emitting electrons is placed in a magnetic field, it was
expected that just as many electrons would be emitted in the
direction of the field as would be emitted in the opposite direction.

In 1956, because of questions raised by Dr. Frank Yang and
Dr. T. D. Lee concerning this postulate, experiments showed that certain
natural phenomena did indeed uniquely describe the difference between
a right and left handed coordinate system.

A casual description of the results is that in the presence of an external
magnetic field certain electron emitting radioactive sources do emit more
electrons in one direction than in another.

These experiments are called the parity experiments.

Thus to find the right handed system for the man in outer space
we could ask him to repeat the parity experiments and after noting
the direction corresponding to the emission of an excess number of
electrons he would be able to uniquely assign a direction to the
magnetic field.


Curvilinear Coordinates 

In many problems of physical and mathematical interest the so-called
“curvilinear coordinate systems” are the more convenient. Since a point
in physical space requires three numbers to define it relative to a
specified coordinate system, it is possible to specify the point
by three numbers other than the lengths measured along the x, y, z
axes of E 3 .

To introduce this possibility in the simplest case consider the
two-dimensional space E 2 . In addition to the doublet (x, y) which
specifies a point P we can choose a polar coordinate system which
provides us with a doublet consisting of the distance $\rho$ of the point
P from the origin and the angle $\phi$ which the line connecting the
origin and P makes with the x axis. (In the xy plane of E 3 we shall
also denote the radial distance to the projection of P by the coordinate
$\rho$, and we shall denote the angle between $\rho$ and the x axis
by the coordinate angle $\phi$.)






A comparison of coordinates is shown below. 



Other curvilinear systems are possible but need not be discussed 
here. 

Even more interesting are the curvilinear systems of E 3 . The two
most important curvilinear systems are the cylindrical coordinate
system and the spherical coordinate system.

The cylindrical coordinate system takes as coordinates $\rho$ and $\phi$
of the polar system in the xy plane plus the z coordinate of E 3 . The
point P is located by the intersection of three surfaces:

1 The cylindrical surface of radius $\rho$ having
its axis coincide with the z axis.

2 A plane containing the z axis and the
point P.

3 A plane parallel to the xy plane at a distance
z from the xy plane.

A point P in the spherical system is also
located by two surfaces and a line which
define the associated coordinates. These parameters are:

1 A spherical surface whose center lies at
the origin and which passes through the
point P. This spherical surface has a
radius r, the distance between the origin
and the point P ($r^2 = x^2 + y^2 + z^2$).

2 The straight line between the origin and
P makes an angle $\theta$ with the z axis.

3 A plane containing the z axis and the
point P. The line of intersection of this
plane and the xy plane makes an angle $\phi$
with respect to the x axis.
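The descriptions above can be turned into explicit conversions from the Cartesian triplet (x, y, z). The following Python sketch is one possible reading of the two definitions; the function names are chosen here for illustration, angles are in radians, and the convention that θ is taken as zero at the origin (where it is undefined) is an assumption made for convenience.

```python
import math

def cartesian_to_cylindrical(x, y, z):
    """Return (rho, phi, z): rho is the distance from the z axis,
    phi the angle measured from the x axis in the xy plane."""
    return math.hypot(x, y), math.atan2(y, x), z

def cartesian_to_spherical(x, y, z):
    """Return (r, theta, phi): r is the distance from the origin,
    theta the angle with the z axis, phi the angle from the x axis."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0  # assumed value at the origin
    return r, theta, math.atan2(y, x)

def spherical_to_cartesian(r, theta, phi):
    """Inverse conversion, used here only as a round-trip check."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

print(cartesian_to_cylindrical(1.0, 1.0, 2.0))
print(cartesian_to_spherical(1.0, 1.0, 1.0))
```

Converting a point to spherical coordinates and back should return the original triplet to within rounding.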
There are various other curvilinear systems defined by ellipsoidal,
hyperbolic surfaces, etc. However, our main intent is merely to indicate
by the examples above the possibility of systems other than
Cartesian.

The spherical coordinates incidentally are very useful in problems
which deal with motion relative to the surface of the earth.


The Length Element 

The functional dependence on the coordinate variables which
the length element “d” takes in a curvilinear coordinate system
can be obtained in two ways, both of which are equivalent.

The first method which can be used is that of geometrical construction.
The second method is performed by first obtaining the
relations between x, y, z and the variables of the curvilinear system
and then substituting directly into the equation

$$d^2 = (x - x')^2 + (y - y')^2 + (z - z')^2$$

To illustrate this we shall consider the distance d between a
point P (x, y) and a point Q (x', y') in E 2 and in two dimensional
polar coordinates with P at $(\rho, \phi)$ and Q at $(\rho', \phi')$.

In the diagram below the two coordinate systems are shown. 



By axiom,

$$d^2 = (x - x')^2 + (y - y')^2$$

By the law of cosines,

$$d^2 = \rho^2 + \rho'^2 - 2\rho\rho' \cos(\phi' - \phi)$$

To illustrate the analytic technique we see that

$$x = \rho \cos\phi$$

and

$$y = \rho \sin\phi$$

Also

$$x' = \rho' \cos\phi'$$

and

$$y' = \rho' \sin\phi'$$

Thus

$$(x - x')^2 = (\rho \cos\phi - \rho' \cos\phi')^2$$

and

$$(y - y')^2 = (\rho \sin\phi - \rho' \sin\phi')^2$$

Giving

$$d^2 = (x - x')^2 + (y - y')^2 = \rho^2 + \rho'^2 - 2\rho\rho' \cos(\phi' - \phi)$$
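The agreement of the two methods can also be seen numerically. The Python sketch below computes the distance once by the law of cosines in polar coordinates and once by converting both points to (x, y) and applying the axiom; the function names and the sample points are chosen here for illustration only.

```python
import math

def polar_distance(rho1, phi1, rho2, phi2):
    """Distance between (rho1, phi1) and (rho2, phi2) by the law of cosines."""
    d_squared = rho1**2 + rho2**2 - 2 * rho1 * rho2 * math.cos(phi2 - phi1)
    # Guard against a tiny negative value caused by rounding.
    return math.sqrt(max(d_squared, 0.0))

def polar_to_cartesian(rho, phi):
    """x = rho cos(phi), y = rho sin(phi)."""
    return rho * math.cos(phi), rho * math.sin(phi)

# Sample points (hypothetical values).
P = (2.0, 0.3)
Q = (3.5, 1.9)
x, y = polar_to_cartesian(*P)
xq, yq = polar_to_cartesian(*Q)
print(polar_distance(*P, *Q), math.hypot(x - xq, y - yq))
```

The two printed numbers should agree to within rounding error.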

The procedure by which we obtain $d^2$ in the three dimensional
systems is the same. The details of the development can be seen from
the diagrams; the results are merely stated.

In cylindrical coordinates

$$d^2 = \rho^2 + \rho'^2 - 2\rho\rho' \cos(\phi' - \phi) + (z' - z)^2$$
In spherical coordinates

$$x = r \sin\theta \cos\phi \qquad x' = r' \sin\theta' \cos\phi'$$

$$y = r \sin\theta \sin\phi \qquad y' = r' \sin\theta' \sin\phi'$$

$$z = r \cos\theta \qquad z' = r' \cos\theta'$$

Thus

$$d^2 = (r \sin\theta \cos\phi - r' \sin\theta' \cos\phi')^2 + (r \sin\theta \sin\phi - r' \sin\theta' \sin\phi')^2 + (r \cos\theta - r' \cos\theta')^2$$

and

$$d^2 = r^2 + r'^2 - 2rr' \{\cos\theta \cos\theta' + \cos(\phi' - \phi) \sin\theta \sin\theta'\}$$
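As a check that does not require the algebra, the closed form for the spherical system can be compared with the direct sum of squared Cartesian differences. The following Python sketch does so for one pair of sample points; the function names and the sample values are illustrative assumptions.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """x = r sin(theta) cos(phi), y = r sin(theta) sin(phi), z = r cos(theta)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def spherical_d_squared(r1, theta1, phi1, r2, theta2, phi2):
    """Closed form for d^2 between two points given in spherical coordinates."""
    bracket = (math.cos(theta1) * math.cos(theta2)
               + math.cos(phi2 - phi1) * math.sin(theta1) * math.sin(theta2))
    return r1**2 + r2**2 - 2 * r1 * r2 * bracket

# Sample points (hypothetical values).
P = (2.0, 0.8, 0.4)
Q = (3.0, 1.7, 2.5)
px, py, pz = spherical_to_cartesian(*P)
qx, qy, qz = spherical_to_cartesian(*Q)
direct = (px - qx)**2 + (py - qy)**2 + (pz - qz)**2
print(direct, spherical_d_squared(*P, *Q))
```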


The reader may verify this result by expanding the squared terms.

The length element d can be approximated in a simple form
when d is very much smaller than x and y.

Let $\Delta x$ be a number much smaller than x or y, and let $\Delta y$ be a
number much smaller than y or x.

If Q is very close to P

$$x' = x \pm \Delta x$$

and

$$y' = y \pm \Delta y$$

then

$$d^2 = (\Delta x)^2 + (\Delta y)^2$$

[Figure: P at (x, y) and Q at $(x - \Delta x, y + \Delta y)$, separated by the legs $\Delta x$ and $\Delta y$]


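For the two dimensional polar coordinates, let

$$\rho' = \rho + \Delta\rho$$

and

$$\phi' = \phi + \Delta\phi$$

then

$$d^2 = \rho^2 + (\rho + \Delta\rho)^2 - 2\rho(\rho + \Delta\rho) \cos(\Delta\phi)$$

Expanding the squared terms and dropping all terms of order
$\Delta\rho(\Delta\phi)^2$ and higher (see footnote 5), we find that

$$d^2 \approx (\Delta\rho)^2 + \rho^2 (\Delta\phi)^2$$

5 We must use the expansion of $\cos(\Delta\phi)$ in powers of $\Delta\phi$, i.e.,

$$\cos(\Delta\phi) = 1 - \frac{(\Delta\phi)^2}{2!} + \frac{(\Delta\phi)^4}{4!} - \cdots + (-1)^n \frac{(\Delta\phi)^{2n}}{(2n)!} + \cdots$$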
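The quality of this approximation can be judged by comparing it with the exact expression for a few small displacements. The Python sketch below does this; the function names and the sample values of ρ, Δρ and Δφ are chosen here purely for illustration.

```python
import math

def exact_d_squared(rho, d_rho, d_phi):
    """Exact d^2 from the law of cosines for a displacement (d_rho, d_phi)."""
    return rho**2 + (rho + d_rho)**2 - 2 * rho * (rho + d_rho) * math.cos(d_phi)

def approx_d_squared(rho, d_rho, d_phi):
    """Approximate d^2, keeping only the lowest-order terms."""
    return d_rho**2 + rho**2 * d_phi**2

rho = 2.0
for step in (1e-1, 1e-2, 1e-3):
    exact = exact_d_squared(rho, step, step)
    approx = approx_d_squared(rho, step, step)
    # The relative difference shrinks as the displacement shrinks.
    print(step, exact, approx, abs(exact - approx) / exact)
```

The leading neglected term is ρ Δρ (Δφ)², so the relative difference falls roughly in proportion to the size of the displacement.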
The Carousel Problem

Problems which are set up in curvilinear coordinates are of particular
interest to us because our frames of reference on the surface
of the earth are strictly curvilinear frames. Sitting in the classroom
we find that the assumption of a Cartesian space is a sufficiently good
approximation for many purposes.

Whenever the observer attempts to extend the Cartesian approximation
too far he finds anomalies which can only be explained by
reverting to the true curvilinear system.

In order to gain some insight into these problems let us consider 
a two dimensional curvilinear system; in particular let us compare 
observations made by an observer standing on the earth with those 
observations made by an observer standing upon a merry-go-round. 
For simplicity we shall assume that the merry-go-round rotates 
with a constant angular velocity. 

The laws of physics will be considered as true in the frame fixed
to the surface of the earth (to a good approximation). We shall find
that certain phenomena which exhibit a particularly simple behavior in the
earth’s frame become quite complicated when described by the observer in the
rotating frame.

Suppose for instance that the man on the carousel drops a marble 
from the edge of the carousel. 

The observer on the earth’s surface sees the marble move off in a 
straight line tangent to the carousel at the point of dropping. 

The observer on the merry-go-round however sees the marble 
curve radially outward and fall behind. 

Initially the rotating observer sees the marble moving radially 
away from him. He interprets this as some fictitious force which acts 
upon all objects in his frame of reference. To the force he might give 
the name centrifugal, and the radial acceleration which he observes 
he might call centrifugal acceleration. 

If the rotating observer is aware that the laws of physics hold in
their given form in the stationary frame only, he can then explain
quite well the observations made from the carousel.

By relating positions of the marble along the straight line in the stationary
frame with the predicted positions of his observation point at corresponding times
in the rotating frame he can predict the path observed from the rotating frame.

It is sufficient to state here that if the straight line path of the marble in the
stationary frame is transformed to the rotating curvilinear frame of the carousel,
terms corresponding to the centrifugal and Coriolis accelerations appear quite
naturally.
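The transformation described here can be sketched numerically. The Python example below rests on several assumptions that are not stated in the text: the carousel turns counter-clockwise with angular velocity omega, the marble is released at time zero from the point (radius, 0) with the tangential speed of the rim, only the horizontal motion is followed, and the rotating axes coincide with the fixed axes at time zero. The function name is hypothetical.

```python
import math

def marble_in_rotating_frame(radius, omega, t):
    """Coordinates of the marble, at time t after release, as seen from
    axes fixed to the carousel (assumptions stated in the lead-in)."""
    # Stationary frame: a straight line tangent to the rim.
    x = radius
    y = radius * omega * t
    # The carousel axes have turned through the angle omega * t.
    angle = omega * t
    x_rot = x * math.cos(angle) + y * math.sin(angle)
    y_rot = -x * math.sin(angle) + y * math.cos(angle)
    return x_rot, y_rot

radius, omega = 2.0, 1.0
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    x_rot, y_rot = marble_in_rotating_frame(radius, omega, t)
    # Radial distance grows; a negative y_rot means the marble lags behind.
    print(t, math.hypot(x_rot, y_rot), y_rot)
```

Under these assumptions the marble's distance from the center increases while it drifts backward relative to the release point on the rim, which is the curved path seen by the rotating observer.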



[Figure: the fixed frame, showing the path of the marble]
For the second problem, assume that the observer standing on the
circumference of the carousel wishes to shoot an arrow into a post at the center.

In his first attempt the rotating observer aims directly at the center. He
finds that the arrow curves away from the center in the direction of rotation of
the carousel. This deviation he again attributes to a fictitious force which he calls
the Coriolis force.

To strike the center post the rotating observer finds that he must fire the arrow
in a direction which compensates for the rotation of the carousel.

The explanation for this is quite simple when viewed from the stationary frame:
at the instant of firing the arrow has two velocities.

One velocity, $V_b$, is provided by the bow; the other velocity, $V_t$, arises from the
motion of the point of firing with respect to the stationary system.

The velocity of the arrow relative to the earth is the vector sum of the
velocities $V_b$ and $V_t$.

Thus to strike the center point this resultant velocity must be directed along
the line joining the firing point and the center.

Again the path of the arrow once it has been fired is a straight line in the
stationary system, while its path when viewed from the rotating frame is a
complicated curve.
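The condition on the resultant velocity can be worked out for a particular case. The Python sketch below assumes, for illustration only, that the archer stands at (radius, 0) on a carousel turning counter-clockwise with angular velocity omega, so that the firing point moves with velocity (0, radius × omega) in the stationary frame, and that the bow gives the arrow a fixed speed relative to the archer. The function name and the sample numbers are hypothetical.

```python
import math

def aim_for_center(radius, omega, bow_speed):
    """Return (v_b, v_t, resultant) as 2-tuples in the stationary frame,
    with v_b chosen so that v_b + v_t points from (radius, 0) to the center."""
    v_t = (0.0, radius * omega)  # velocity of the firing point
    if bow_speed <= radius * omega:
        raise ValueError("bow speed too small to cancel the tangential velocity")
    # The bow must cancel the tangential component; the rest points inward.
    v_b = (-math.sqrt(bow_speed**2 - (radius * omega)**2), -radius * omega)
    resultant = (v_b[0] + v_t[0], v_b[1] + v_t[1])
    return v_b, v_t, resultant

v_b, v_t, resultant = aim_for_center(radius=5.0, omega=0.4, bow_speed=30.0)
# Angle by which the archer must aim away from the center line, in degrees.
lead_angle = math.degrees(math.atan2(-v_b[1], -v_b[0]))
print(v_b, resultant, lead_angle)
```

The archer must aim against the direction of rotation, by the small angle printed, so that the vector sum lies along the line to the center.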
D. Coordinate Transformations

In order to introduce the concept of changes in the coordinates 
or coordinate transformations in an elementary manner, we shall 
confine our discussion to a two dimensional space. 

Of the various changes or transformations which can be performed,
only two will be of interest to us at this point: translation
of the origin and rotation about the origin.


Translations 

The simplest of all transformations is the operation of translation. Consider the space $E_2$ shown in the diagram to follow.

On the left is the original coordinate system O with the coordinates of a point P being x and y. To the right of this diagram is the coordinate system after the origin has been shifted to O' (a distance A in the direction of the x axis and a distance B in the direction of the y axis). Rotation is specifically excluded and thus the new coordinate axes are parallel to the old.

Obtaining algebraic relations between the new coordinates and 
the old coordinates is the fundamental problem in coordinate 
transformations. 

From the construction we can readily obtain the relation between (x, y) and (X, Y).

$$X = x - A, \quad \text{and} \quad Y = y - B$$

It is important to note in this elementary case that the operation of translation does not change the distance “d” between the two points P and Q.


[Figure: the points P and Q, separated by the distance d, with their coordinates in the translated system.]


In more elegant language we say that the distance “d” is invariant (unchanged in magnitude) under the operation of translation.

$$d^2 = (x - x')^2 + (y - y')^2 = (X - X')^2 + (Y - Y')^2$$
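The invariance of d under translation can be checked numerically. The following is a minimal sketch only; the function names `translate` and `distance` and the numerical values of P, Q, A and B are assumptions chosen for illustration and do not come from the text.

```python
import math

def translate(point, shift):
    """Coordinates (X, Y) of `point` after the origin is moved by `shift` = (A, B)."""
    x, y = point
    a, b = shift
    return (x - a, y - b)          # X = x - A, Y = y - B

def distance(p, q):
    """Distance d between two points given in the same coordinate system."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Illustrative values only (assumed, not from the text).
P, Q = (3.0, 4.0), (-1.0, 2.5)
SHIFT = (2.0, -7.0)                # A = 2, B = -7

d_old = distance(P, Q)
d_new = distance(translate(P, SHIFT), translate(Q, SHIFT))
print(d_old, d_new)                # the two values agree
```

The shift (A, B) cancels in every coordinate difference, which is why the two distances agree for any choice of A and B.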


Rotations in E 2 


We shall approach the problem of rotating the coordinate system about the origin first as a purely geometric problem. Once we have obtained the relations between the coordinates after rotation and the coordinates previous to the rotation, we shall show how the results of the geometric problem can be set up as a problem involving a matrix operation. In order to facilitate the transition from one set of expressions to the other it will prove convenient to relabel the coordinates x and y as $x_1$ and $x_2$.



The subscripts 1 on $x_1$ and 2 on $x_2$ serve to distinguish the different coordinates in $E_2$ (namely the coordinates which formerly were denoted by x and y). In other words the subscripts 1 and 2 are labels only (not multipliers).

Now rotate the coordinate system about the origin O through an angle $\alpha$.


[Figure: the $x_1$ and $x_2$ axes rotated about the origin O through the angle $\alpha$.]
The new coordinates of the point P are $x_1'$ and $x_2'$.

By the elementary procedures of trigonometry we can obtain the relations between $x_1$, $x_2$ and $x_1'$, $x_2'$.

$$x_1' = (\cos\alpha)\,x_1 + (\sin\alpha)\,x_2$$

since the projections of the lengths $x_1$ and $x_2$ upon the $x_1'$ axis are $(\cos\alpha)\,x_1$ and $(\sin\alpha)\,x_2$.

In the same manner we can find $x_2'$ by triangulation.

$$x_2' = (-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2$$

The coordinates before rotation, $x_1$ and $x_2$, are related to the coordinates after rotation, $x_1'$ and $x_2'$, by the linear algebraic equations above. This is called a linear transformation of the coordinates $x_1$ and $x_2$ to the coordinates $x_1'$ and $x_2'$.

To illustrate this transformation of coordinates consider the following simple problem.

Assume that there is a two-dimensional Cartesian coordinate system laid out upon the surface of the earth with the origin at O.

Now we construct a circular merry-go-round with its axis of rotation at O. Coordinate axes $x_1'$ and $x_2'$ are painted upon the floor of the merry-go-round.












As long as the axes $x_1'$ and $x_1$ are aligned the coordinates of a given point have the same value in $(x_1, x_2)$ as in $(x_1', x_2')$. This situation obviously corresponds to a rotation of zero degrees, i.e., $\alpha = 0$.

Thus if

$$x_1' = (\cos\alpha)\,x_1 + (\sin\alpha)\,x_2$$

and

$$x_2' = (-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2$$

then for the particular case $\alpha = 0$

$$x_1' = x_1, \qquad x_2' = x_2.$$

[Figure: axes aligned.]


From the condition of alignment let us now compare the location of a point P with coordinates $(x_1, x_2)$ in the system fixed upon the earth with the observed coordinates of the point P relative to the axes fixed upon the merry-go-round when the merry-go-round rotates through an angle $\alpha$ of 30° counterclockwise.

In this situation $\alpha = 30^\circ$ and

$$x_1' = (\cos 30^\circ)\,x_1 + (\sin 30^\circ)\,x_2 = \frac{\sqrt{3}}{2}\,x_1 + \frac{1}{2}\,x_2$$

and

$$x_2' = (-\sin 30^\circ)\,x_1 + (\cos 30^\circ)\,x_2 = -\frac{1}{2}\,x_1 + \frac{\sqrt{3}}{2}\,x_2$$

If we are now told that

$$x_1 = 40 \text{ ft.}$$

and

$$x_2 = 30 \text{ ft.}$$

then

$$x_1' = \frac{\sqrt{3}}{2}(40) + \frac{1}{2}(30) = 5\,(4\sqrt{3} + 3) \text{ ft.}$$

$$x_2' = -\frac{1}{2}(40) + \frac{\sqrt{3}}{2}(30) = 5\,(-4 + 3\sqrt{3}) \text{ ft.}$$
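The arithmetic of this example can be verified with a few lines of Python. This is only a sketch under stated assumptions: the function name `rotate_axes` is hypothetical, and the angle is taken in degrees for convenience.

```python
import math

def rotate_axes(x1, x2, alpha_deg):
    """Coordinates (x1', x2') of a fixed point after the axes are rotated
    counterclockwise through alpha_deg degrees about the origin."""
    a = math.radians(alpha_deg)
    x1p = math.cos(a) * x1 + math.sin(a) * x2
    x2p = -math.sin(a) * x1 + math.cos(a) * x2
    return x1p, x2p

# The point of the merry-go-round example: x1 = 40 ft., x2 = 30 ft., alpha = 30 degrees.
x1p, x2p = rotate_axes(40.0, 30.0, 30.0)
print(x1p, 5 * (4 * math.sqrt(3) + 3))    # both approximately 49.64 ft.
print(x2p, 5 * (-4 + 3 * math.sqrt(3)))   # both approximately 5.98 ft.
```

Calling `rotate_axes(40.0, 30.0, 0.0)` returns the original coordinates, which corresponds to the case of aligned axes.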
These two algebraic expressions can be written in abbreviated forms and in terms of matrices.

The field of matrices and linear transformations is naturally much broader than our example of the rotation of a coordinate system of two dimensions. However, once the use of matrices is understood in this simple problem the extension of the method to the more general problem is quite straightforward.

Introduction of New Symbols 

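Let

$$x_1' = (\cos\alpha)\,x_1 + (\sin\alpha)\,x_2 = S_{11}\,x_1 + S_{12}\,x_2$$

and

$$x_2' = (-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2 = S_{21}\,x_1 + S_{22}\,x_2$$

thus

$$S_{11} = \cos\alpha, \qquad S_{12} = \sin\alpha$$

$$S_{21} = -\sin\alpha, \qquad S_{22} = \cos\alpha$$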

We now introduce a further contraction of the notation presented above. We note that each equation is linear and has the general form

$$x_j' = \sum_{k=1}^{2} S_{jk}\,x_k$$

The indices (or labels) j and k can be either 1 or 2. For instance, if j = 1 and we sum over k, we obtain the first equation. We further find that in the case that j = 2 we obtain the second equation, remembering that $S_{jk}$ is defined above:

$$x_j' = \sum_{k=1}^{2} S_{jk}\,x_k = S_{j1}\,x_1 + S_{j2}\,x_2$$

The summation sign $\Sigma$ can be eliminated by the Einstein summation convention. This convention merely drops the summation sign with the understanding that the repeated index k in $S_{jk}\,x_k$ always demands a sum over all of the possible values of k.

$$x_j' = \sum_{k=1}^{2} S_{jk}\,x_k = S_{jk}\,x_k$$

Take an example: Let j = 1; then

$$x_1' = S_{1k}\,x_k = S_{11}\,x_1 + S_{12}\,x_2$$

Note that none of this introduces anything new. This procedure only makes the equations more concise and allows us to put more information into a smaller space. In such a form, however, the technique can be readily extended.

To reiterate:

$x_j' = S_{jk}\,x_k$ implies in our simple problem,

$$x_1' = (\cos\alpha)\,x_1 + (\sin\alpha)\,x_2$$
$$x_2' = (-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2$$
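The index notation translates almost directly into a short program. The sketch below is illustrative only; the names `rotation_coefficients` and `transform` are assumptions, and the indices run from 0 rather than 1, so that $S_{11}$ is stored as `S[0][0]`.

```python
import math

def rotation_coefficients(alpha):
    """The coefficients S_jk of the rotation (0-based: S_11 is S[0][0])."""
    return [[math.cos(alpha), math.sin(alpha)],
            [-math.sin(alpha), math.cos(alpha)]]

def transform(S, x):
    """x_j' = S_jk x_k: the repeated index k is summed over all its values."""
    return [sum(S[j][k] * x[k] for k in range(len(x))) for j in range(len(S))]

alpha = math.radians(30)
S = rotation_coefficients(alpha)
x = [40.0, 30.0]
x_prime = transform(S, x)

# The same result written out term by term, as in the two equations above.
explicit = [math.cos(alpha) * x[0] + math.sin(alpha) * x[1],
            -math.sin(alpha) * x[0] + math.cos(alpha) * x[1]]
print(x_prime, explicit)
```

The inner `sum(...)` plays the part of the summation sign; the convention simply agrees to leave it unwritten.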

These algebraic equations can also be written in the form of matrices. 


E. Matrices 


The characteristics of a set of linear algebraic equations lead to 
a particularly convenient and powerful representation in terms of 
matrices. 


A matrix is a rectangular array of mathematical objects. In our particular problem⁶ the matrices will be square, and the objects in the array will be numbers; i.e., the coefficients of the algebraic equations.

⁶ In general the matrix can be rectangular; it need not be restricted to the square array.
The square array

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

is a square matrix. The 2 x 2 matrix above has been written in a general form having four components $A_{11}$, $A_{12}$, $A_{21}$, and $A_{22}$. These four components will have numerical values which are specified by individual problems. It is customary to perform the general discussion of matrices with the symbols $A_{lm}$.

For instance the array

$$\begin{pmatrix} 6 & -1 \\ 0 & 57 \end{pmatrix}$$

is a matrix. In this specific example the symbols $A_{nm}$ have been assigned specific numerical values; namely

$$A_{11} = 6, \qquad A_{12} = -1$$

$$A_{21} = 0, \qquad A_{22} = 57$$

This assignment has no general significance except to illustrate 
the form in which one might find a numerical matrix. 

Notice that while introducing this representation to the reader 
we have “sectioned off” the various positions in the array by 
dashed lines. Ordinarily this is not done and the matrix appears in 
the following fashion: 
The technique of sectioning the matrix will be continued until
the reader has become familiar with the representation. 

Before proceeding to the laws for matrix manipulation it should 
be recognized that the use of the 2 x 2 square array is not intended 
in any manner to limit the operations discussed to 2 x 2 matrices 
alone. Our remarks concerning addition and multiplication, etc. 
will apply equally well to arrays of higher order. 

An example of a 4 x 4 array is shown below.


$$A = \begin{pmatrix} A_{11} & A_{12} & A_{13} & A_{14} \\ A_{21} & A_{22} & A_{23} & A_{24} \\ A_{31} & A_{32} & A_{33} & A_{34} \\ A_{41} & A_{42} & A_{43} & A_{44} \end{pmatrix}$$


Notice carefully the indices of the components of these matrices. Any one of the components is denoted by the general symbol

$$A_{nm}$$

where the indices n and m can have any one of the four values 1, 2, 3, or 4.

Using $A_{nm}$ as an example note that the leading index “n” indicates the row in which the component falls while the trailing index “m” specifies the column to which the component belongs.

Thus the component $A_{32}$ is the component belonging to the third row and the second column.


[Figure: the component $A_{32}$ located at the intersection of the third row and the second column.]
When two matrices are added the components of the resultant matrix are made up of the sums of the corresponding elements of the original matrices; i.e., $(A + B)_{mn} = A_{mn} + B_{mn}$.

The addition of matrices obeys the 

Commutative Law 

A + B = B + A 


The Associative Law 

(A + B) + C = A +(B + C) 

and The Distributive Law 

n(A + B) = nA + nB 


The multiplication of a matrix by a scalar quantity n is accomplished by multiplying each component by the scalar in question. As an example we can write

$$n \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} nA_{11} & nA_{12} \\ nA_{21} & nA_{22} \end{pmatrix}$$


These properties of addition are stressed because the multiplication of matrices does not obey the commutative law.

The multiplication operation in the case of matrices obeys the Associative Law:

$$(A \cdot B) \cdot C = A \cdot (B \cdot C)$$

The multiplication of matrices does not necessarily obey the commutative law;⁷

$$A \cdot B \neq B \cdot A$$

The product of two matrices can in very special cases commute; in general, however, commutation of matrix products is not expected.
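These laws can be tried out on small numerical arrays. The sketch below is a hedged illustration: the helper names `mat_add`, `scalar_mul` and `mat_mul` and the three matrices are assumptions chosen for the purpose, and the product uses the row-by-column rule which is developed later in this chapter.

```python
def mat_add(A, B):
    """(A + B)_mn = A_mn + B_mn."""
    return [[A[m][n] + B[m][n] for n in range(len(A[0]))] for m in range(len(A))]

def scalar_mul(n, A):
    """Multiply every component of A by the scalar n."""
    return [[n * a for a in row] for row in A]

def mat_mul(A, B):
    """Row-by-column product: (A . B)_jm = sum over k of A_jk B_km."""
    return [[sum(A[j][k] * B[k][m] for k in range(len(B)))
             for m in range(len(B[0]))] for j in range(len(A))]

# Illustrative matrices (assumed, not from the text).
A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

print(mat_add(A, B) == mat_add(B, A))                                   # True
print(scalar_mul(3, mat_add(A, B)) ==
      mat_add(scalar_mul(3, A), scalar_mul(3, B)))                      # True
print(mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C)))           # True
print(mat_mul(A, B), mat_mul(B, A))   # [[2, 1], [1, 0]] [[0, 1], [1, 2]]
```

The last line shows two matrices for which A · B and B · A differ, although each of the other laws holds.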

⁷ The symbol ≠ means “does not equal.” In this case it is intended to mean not necessarily equal.
Before proceeding further with the details of taking products of
matrices we should mention briefly a few of the characteristic 
matrices with which we shall be dealing. 

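The transpose of a matrix is obtained by interchanging rows and columns. Consider the matrix A,

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

The transpose of A is written $\tilde{A}$ with elements $\tilde{A}_{nm}$;

$$\tilde{A} = \begin{pmatrix} \tilde{A}_{11} & \tilde{A}_{12} \\ \tilde{A}_{21} & \tilde{A}_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{21} \\ A_{12} & A_{22} \end{pmatrix}$$

In the illustration above the relation between the elements $A_{nm}$ of the original matrix A and the elements $\tilde{A}_{mn}$ of the transposed matrix $\tilde{A}$ is seen to be

$$\tilde{A}_{mn} = A_{nm}$$

It is apparent that the interchange of rows and columns corresponds to an interchange of indices on the components.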

The magnitude of a square matrix is defined as the determinant 
of that matrix; thus 

Magnitude of A = | A | = Det A 

The inverse of a matrix can be obtained if the magnitude (i.e., 
Det A ) is non-zero. 

After the discussion of matrix multiplication the methods for 
finding the elements of the inverse matrix will be discussed. 

One other form will prove extremely useful in future discussions; this matrix is the unit matrix.⁸ The unit matrix is a square array having the value plus one for all diagonal elements and having the value zero for all off-diagonal elements.

The elements of the unit matrix have a special symbol called the Kronecker delta.

⁸ This matrix is sometimes called the identity matrix.
Kronecker delta

$$\delta_{jm} = \begin{cases} 1 & \text{when } j = m \\ 0 & \text{when } j \neq m \end{cases}$$

The symbol which we shall adopt for the unit matrix is I. Thus the elements of I are $\delta_{jm}$.

An example of a 3 x 3 unit matrix is shown below.

$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$



Later as an exercise the reader should demonstrate that the unit matrix is the Identity Operator, which is defined by the equation

$$I \cdot A = A$$


In other words the application of the unit matrix to a matrix A 
gives the same matrix A. 

Using the linear algebraic equations which define the transformation of $(x_1, x_2)$ to $(x_1', x_2')$ and using the square array representation for the matrix S, we define the rules by which we multiply a column matrix (or vector) by a square matrix.

If a point P is defined by the coordinates $x_1$ and $x_2$ these coordinates can be set forth in an ordered column called a column matrix (or vector).

The column matrix locating the point P is represented by the symbol r and is written⁹

$$r = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

⁹ The symbol r will be used to represent a vector in the chapter concerning vectors. The column matrix transforms like a vector and therefore will be referred to as such.




We have already represented S by the matrix

$$S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$$

We multiply a row in the square matrix by the single column of the column matrix (or vector) to obtain a given element of the resulting column matrix.

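Our two linear algebraic relations representing the rotation were

$$x_1' = S_{11}\,x_1 + S_{12}\,x_2$$
$$x_2' = S_{21}\,x_1 + S_{22}\,x_2$$

In terms of matrices we write this as

$$\begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} S_{11}\,x_1 + S_{12}\,x_2 \\ S_{21}\,x_1 + S_{22}\,x_2 \end{pmatrix}$$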


One observes that the product of a column matrix and a square 
matrix gives another column matrix. 

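Since the two column matrices above are equal their elements are equal.

$$\begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} S_{11}\,x_1 + S_{12}\,x_2 \\ S_{21}\,x_1 + S_{22}\,x_2 \end{pmatrix}$$

i.e.,

$$x_j' = S_{j1}\,x_1 + S_{j2}\,x_2 = \sum_{k=1}^{2} S_{jk}\,x_k$$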


To review this in terms of the previous carousel problem we again use a rotation of 30°.

The vector in the stationary system is

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = r$$

The vector as seen from the merry-go-round is

$$\begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = r'$$

The operation of rotation through 30° is represented by the matrix

$$\begin{pmatrix} .866 & .500 \\ -.500 & .866 \end{pmatrix} = S(30^\circ)$$

The relation between the two sets of coordinates is provided by the equation,

$$\begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} .866 & .500 \\ -.500 & .866 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

The student should carry out the matrix multiplication and show that the same result is obtained as before, i.e.,

$$x_1' = (.866)\,x_1 + (.500)\,x_2$$
$$x_2' = (-.500)\,x_1 + (.866)\,x_2$$


Matrix Multiplication 


A single rotation of the coordinates $x_1$, $x_2$ to the set $x_1'$, $x_2'$ through an angle $\alpha$ has been shown to be represented by the equation

$$r' = S(\alpha) \cdot r$$

where $S(\alpha)_{11} = \cos\alpha$, etc.

The $\alpha$ in parentheses means that $S(\alpha)$ is the rotation matrix corresponding to a rotation through an angle $\alpha$ in a counterclockwise direction.

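Consider now a second rotation of the system through an angle $\beta$ taking the coordinates $(x_1', x_2')$ to the coordinates $(x_1'', x_2'')$.

Then

$$r'' = S(\beta) \cdot r'$$

Since

$$r' = S(\alpha) \cdot r$$

thus

$$r'' = S(\beta) \cdot S(\alpha) \cdot r = S(\beta + \alpha) \cdot r$$

The relations between $S(\beta) \cdot S(\alpha)$ and $S(\beta + \alpha)$ can be worked out quite simply by writing down the linear algebraic equations representing these transformations.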

$r' = S(\alpha) \cdot r$ implies the two equations

$$x_1' = (\cos\alpha)\,x_1 + (\sin\alpha)\,x_2$$

and

$$x_2' = (-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2$$

The second rotation $r'' = S(\beta) \cdot r'$ indicates the following relations:

$$x_1'' = (\cos\beta)\,x_1' + (\sin\beta)\,x_2'$$

and

$$x_2'' = (-\sin\beta)\,x_1' + (\cos\beta)\,x_2'$$

By substituting the expansions of $x_1'$ and $x_2'$ into these last relations the transformation $S(\beta + \alpha)$ is obtained.
$$x_1'' = (\cos\beta)\,\{(\cos\alpha)\,x_1 + (\sin\alpha)\,x_2\} + (\sin\beta)\,\{(-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2\}$$

$$x_2'' = (-\sin\beta)\,\{(\cos\alpha)\,x_1 + (\sin\alpha)\,x_2\} + (\cos\beta)\,\{(-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2\}$$

or

$$x_1'' = [\cos\beta\cos\alpha - \sin\beta\sin\alpha]\,x_1 + [\cos\beta\sin\alpha + \sin\beta\cos\alpha]\,x_2$$
$$x_2'' = [-\sin\beta\cos\alpha - \sin\alpha\cos\beta]\,x_1 + [-\sin\beta\sin\alpha + \cos\beta\cos\alpha]\,x_2$$

These relations simplify to the forms

$$x_1'' = (\cos(\beta + \alpha))\,x_1 + (\sin(\beta + \alpha))\,x_2$$
$$x_2'' = (-\sin(\beta + \alpha))\,x_1 + (\cos(\beta + \alpha))\,x_2$$

This demonstrates that $S(\beta) \cdot S(\alpha)$ is indeed $S(\beta + \alpha)$, and if we examine the forms closely we find that the element $S(\beta + \alpha)_{nm}$ is the product of the n th row in $S(\beta)$ times the m th column in $S(\alpha)$.

In other words

$$S(\beta + \alpha)_{nm} = \sum_{k=1}^{2} S(\beta)_{nk}\,S(\alpha)_{km} = S(\beta)_{n1}\,S(\alpha)_{1m} + S(\beta)_{n2}\,S(\alpha)_{2m}$$
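To illustrate this consider the $S_{11}$ element of $S(\beta + \alpha)$.

$$S(\beta + \alpha) = S(\beta) \cdot S(\alpha) = \begin{pmatrix} \cos\beta & \sin\beta \\ -\sin\beta & \cos\beta \end{pmatrix} \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} = \begin{pmatrix} \cos(\beta + \alpha) & \sin(\beta + \alpha) \\ -\sin(\beta + \alpha) & \cos(\beta + \alpha) \end{pmatrix}$$

The shaded areas (the first row of $S(\beta)$ and the first column of $S(\alpha)$) indicate one particular row-column multiplication.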

Here,

$$S(\beta + \alpha)_{11} = (S(\beta) \cdot S(\alpha))_{11} = S(\beta)_{11}\,S(\alpha)_{11} + S(\beta)_{12}\,S(\alpha)_{21}$$

$$= \cos\beta\cos\alpha - \sin\beta\sin\alpha = \cos(\beta + \alpha)$$
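The equality of $S(\beta) \cdot S(\alpha)$ and $S(\beta + \alpha)$ can also be confirmed numerically for particular angles. The following is a minimal sketch; the function names `S` and `mat_mul` and the choice of 30° and 45° are assumptions made for illustration.

```python
import math

def S(angle):
    """Rotation matrix for a counterclockwise rotation of the axes through `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s], [-s, c]]

def mat_mul(A, B):
    """Row-by-column product: element (n, m) is the sum over k of A_nk B_km."""
    return [[sum(A[n][k] * B[k][m] for k in range(len(B)))
             for m in range(len(B[0]))] for n in range(len(A))]

alpha, beta = math.radians(30), math.radians(45)   # illustrative angles
product = mat_mul(S(beta), S(alpha))               # two successive rotations
direct = S(beta + alpha)                           # one rotation through the sum

for n in range(2):
    for m in range(2):
        print(n + 1, m + 1, product[n][m], direct[n][m])   # pairs agree to rounding
```

For rotations about a single axis the order of the two factors does not matter; this is one of the special cases in which a matrix product commutes.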

Once we have demonstrated the necessary multiplication rules required to provide consistency in the rotation through two successive angles, the general problem of multiplication can be discussed.

The multiplication of one matrix by another can be considered (for the purposes of introduction) as the multiplication of an ordered array of column vectors by a matrix. The result must of course be an ordered array of column vectors which is merely the resultant matrix.

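To illustrate this let us compute the product A · B written as

$$A \cdot B = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \left[ \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} \\ B_{21} \end{pmatrix} \quad \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{12} \\ B_{22} \end{pmatrix} \right]$$

According to our rules of multiplication of a column vector by a matrix this becomes

$$A \cdot B = \left[ \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} \\ A_{21}B_{11} + A_{22}B_{21} \end{pmatrix} \quad \begin{pmatrix} A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{12} + A_{22}B_{22} \end{pmatrix} \right] = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix}$$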


We can dispense with this intermediate viewpoint now and proceed to the general rules governing the multiplication of the two matrices A and B.

Definition of the product of Two Matrices A and B (C = A · B).

$$C = A \cdot B = \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{pmatrix}$$


Note that the elements of the product matrix are obtained by 
multiplying a row in A by a column in B. 

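As an example let us compute the product matrix of two particular square arrays.

Let

$$A = \begin{pmatrix} 1 & 3 \\ -4 & 0 \end{pmatrix}$$

and

$$B = \begin{pmatrix} 2 & 1 \\ 1 & -2 \end{pmatrix}$$

Then

$$A \cdot B = \begin{pmatrix} 2 + 3 & 1 - 6 \\ -8 + 0 & -4 - 0 \end{pmatrix} = \begin{pmatrix} 5 & -5 \\ -8 & -4 \end{pmatrix}$$

It is well to recognize the values assigned to the elements for our special problem above. We have taken a problem in which

$$A_{11} = 1, \quad A_{12} = 3, \quad B_{11} = 2, \quad B_{12} = 1$$

$$A_{21} = -4, \quad A_{22} = 0, \quad B_{21} = 1, \quad B_{22} = -2$$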
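The row-by-column rule is easily written as a short routine and applied to the two arrays of this example. This is a sketch only; the name `mat_mul` is an assumption, and the indices in the program run from 0 rather than 1.

```python
def mat_mul(A, B):
    """C = A . B, where C_jm is the sum over k of A_jk B_km (a row of A times a column of B)."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[j][k] * B[k][m] for k in range(inner)) for m in range(cols)]
            for j in range(rows)]

A = [[1, 3],
     [-4, 0]]
B = [[2, 1],
     [1, -2]]

C = mat_mul(A, B)
print(C)               # [[5, -5], [-8, -4]]
print(mat_mul(B, A))   # [[-2, 6], [9, 3]]
```

The second line of output shows that for these particular arrays B · A is not equal to A · B.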
We can now relate the resulting algebraic relations just obtained to the components of the resultant matrix C.

Note that the product of two matrices is itself a matrix. Therefore our problem consists of relating the elements $A_{lm}$ and $B_{sr}$ to the elements of the product matrix $C_{kj}$.

Symbolically we write the product as 

C = A • B 


To obtain the elements of C we multiply rows in the first matrix 
by columns in the second matrix. 

The elements of the resulting sums consist of products of elements 
with like joining indices. The outer indices serve to designate the 
position in the final product matrix. 

To show this let us reconsider our first example, indicating the 
multiplications which give the one-one position in the product 
matrix. 

We have called the product matrix

$$C = A \cdot B$$

Thus

$$C_{11} = A_{11}B_{11} + A_{12}B_{21}$$


Notice that each term in the sum has outer indices corresponding to the indices on the C element. Again notice that the inner indices of each product are alike, and that the number of terms in the sum runs over all possible values of the indices. In this case the matrices are 2 x 2; therefore there are but two terms in the sum.

Let us now reconsider our element

$$C_{11} = A_{11}B_{11} + A_{12}B_{21}$$

This sum can be written in a more abbreviated manner as

$$C_{11} = \sum_{k=1}^{2} A_{1k}B_{k1}$$
It is perhaps interesting to use the summation convention in this particular example

$$C_{11} = \sum_{k=1}^{2} A_{1k}B_{k1} = A_{1k}B_{k1}$$

i.e., the existence of the repeated inner index k indicates that this index should be summed over all possible values of k.

There are four components in the product matrix C. We designate any one of the four by the general symbol

$$C_{jm}$$

where j can be 1 or 2 and m can be 1 or 2.

Thus

$$C_{jm} = A_{j1}B_{1m} + A_{j2}B_{2m} = \sum_{k=1}^{2} A_{jk}B_{km} = A_{jk}B_{km}$$

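As an example we can write out the expansion of $C_{21}$ by taking the case in which j = 2 and m = 1:

$$C_{21} = A_{21}B_{11} + A_{22}B_{21},$$

or, as the second row of A multiplied by the first column of B,

$$C_{21} = \begin{pmatrix} A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} \\ B_{21} \end{pmatrix} = A_{21}B_{11} + A_{22}B_{21}$$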


The Inverse Matrix 

Up to this point we have considered only that transformation which expresses $x_j'$ as a linear combination of the coordinates $x_1$ and $x_2$.

An equally important transformation is the rotation which carries the coordinates $x_1'$ and $x_2'$ back to $x_1$ and $x_2$.



This is the inverse rotation operation labeled by the symbol $S^{-1}$ with elements $S_{jm}'$.

From the first diagram,

$$x_1 = (\cos\alpha)\,x_1' + (-\sin\alpha)\,x_2'$$
$$x_1 = S_{11}'\,x_1' + S_{12}'\,x_2'$$

From the second diagram,

$$x_2 = (\sin\alpha)\,x_1' + (\cos\alpha)\,x_2'$$
$$x_2 = S_{21}'\,x_1' + S_{22}'\,x_2'$$



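Both of these relations could have been obtained by solving for the x’s in the two simultaneous algebraic expressions,

$$x_1' = (\cos\alpha)\,x_1 + (\sin\alpha)\,x_2$$
$$x_2' = (-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2$$

The components $S_{jk}'$ form the elements of the matrix $S^{-1}$, the inverse rotation matrix.

$$S_{11}' = \cos\alpha, \qquad S_{12}' = -\sin\alpha$$
$$S_{21}' = \sin\alpha, \qquad S_{22}' = \cos\alpha$$

and

$$S^{-1} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}$$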



We have defined the product of two matrices A · B by showing that an element of the resulting matrix could be written

$$(A \cdot B)_{lm} = \sum_{\text{all } k} A_{lk}B_{km} = A_{lk}B_{km}$$

We shall now discover that the product of $S^{-1}$ and S has a unique form.
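$$S^{-1} \cdot S = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I, \text{ the unit matrix}$$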


The reader should work this out in detail to check the result. Another way of writing the components of $S^{-1} \cdot S$ is

$$(S^{-1} \cdot S)_{jm} = \sum_{k} S_{jk}'\,S_{km} = \delta_{jm}$$

where $\delta_{jm}$ is the Kronecker delta.

To remind the reader, this symbol has the value ONE when “j” is equal to “m”; the symbol has the value ZERO when “j” is not equal to “m”.

$$\delta_{jm} = \begin{cases} 1 & (j = m) \\ 0 & (j \neq m) \end{cases}$$

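The linear algebraic equations specifying the inverse rotation

$$x_1 = (\cos\alpha)\,x_1' + (-\sin\alpha)\,x_2'$$

$$x_2 = (\sin\alpha)\,x_1' + (\cos\alpha)\,x_2'$$

can be written as a matrix product;

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} S_{11}' & S_{12}' \\ S_{21}' & S_{22}' \end{pmatrix} \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} S_{11}'\,x_1' + S_{12}'\,x_2' \\ S_{21}'\,x_1' + S_{22}'\,x_2' \end{pmatrix}$$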


It is of some interest to see that our first rotation could have been multiplied by $S^{-1}$ to give this result which we obtained geometrically.

$$S^{-1} \cdot \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = S^{-1} \cdot S \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = I \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

Symbolically this can be written

$$S^{-1} \cdot r' = S^{-1} \cdot S \cdot r = I \cdot r = r$$


The operation S followed by the operation $S^{-1}$ corresponds to the series of rotations shown below.



If the magnitude or determinant of the matrix A is not zero we define the inverse of A by the product

$$A \cdot A^{-1} = I$$

where I is the unit matrix.

The elements $A_{nk}'$ of $A^{-1}$ can be obtained from the definition of the Determinant. Consider $A \cdot A^{-1} = I$; or

$$\sum_{n} A_{mn}\,A_{nk}' = \delta_{mk}$$

The determinant of A is defined by

$$\text{Det } A = \sum_{n} (-1)^{m+n}\,A_{mn}\,\text{minor } A_{mn}$$

or

$$\delta_{mk} = \sum_{n} (-1)^{k+n}\,A_{mn}\,\frac{\text{minor } A_{kn}}{\text{Det } A}$$

From our definition of the inverse

$$\delta_{mk} = \sum_{n} A_{mn}\,A_{nk}'$$

thus

$$A_{nk}' = \frac{(-1)^{k+n}\,\text{minor } A_{kn}}{\text{Det } A}$$


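In order to demonstrate a practical application of this elegance, consider as an example of such a calculation the matrix,

$$A = \begin{pmatrix} 3 & 1 & 0 \\ -1 & 2 & 1 \\ 0 & 2 & 2 \end{pmatrix}$$

Now

$$\text{Det } A = \begin{vmatrix} 3 & 1 & 0 \\ -1 & 2 & 1 \\ 0 & 2 & 2 \end{vmatrix} = 3 \begin{vmatrix} 2 & 1 \\ 2 & 2 \end{vmatrix} - 1 \begin{vmatrix} -1 & 1 \\ 0 & 2 \end{vmatrix} + 0$$

$$= 3(4 - 2) - 1(-2) = 6 + 2 = 8$$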

The reader can check this result by the standard diagonal multiplication of a 3 x 3 determinant. Remember that all determinants of order higher than 3 must be expanded by the method of minors. Notice at this point that the minor of $A_{11}$ is

$$\begin{vmatrix} 2 & 1 \\ 2 & 2 \end{vmatrix} = +2$$

The minor of $A_{23}$ is

$$\begin{vmatrix} 3 & 1 \\ 0 & 2 \end{vmatrix} = +6$$

[Figure: the array A shown once for each of these two minors.]
Thus

$$A_{11}' = (-1)^{1+1}\,\frac{\text{min } A_{11}}{\text{Det } A} = \frac{+2}{8} = \frac{1}{4}$$

and

$$A_{32}' = (-1)^{3+2}\,\frac{\text{min } A_{23}}{\text{Det } A} = -\frac{6}{8} = -\frac{3}{4}$$


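The reader, as an exercise, can now demonstrate that

$$A^{-1} = \frac{1}{8} \begin{pmatrix} 2 & -2 & 1 \\ 2 & 6 & -3 \\ -2 & -6 & 7 \end{pmatrix}$$

and in addition

$$A \cdot A^{-1} = \frac{1}{8} \begin{pmatrix} 3 & 1 & 0 \\ -1 & 2 & 1 \\ 0 & 2 & 2 \end{pmatrix} \begin{pmatrix} 2 & -2 & 1 \\ 2 & 6 & -3 \\ -2 & -6 & 7 \end{pmatrix} = \frac{1}{8} \begin{pmatrix} 8 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 0 & 8 \end{pmatrix} = I$$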
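The rule $A_{nk}' = (-1)^{k+n}\,\text{minor } A_{kn} / \text{Det } A$ can be carried out mechanically. The following is a hedged sketch, not a recommended numerical method: the function names `minor`, `det` and `inverse` are assumptions, exact fractions are used to avoid rounding, and the indices run from 0 (which leaves the sign $(-1)^{k+n}$ unchanged).

```python
from fractions import Fraction

def minor(A, i, j):
    """Determinant of the array left after striking out row i and column j of A."""
    sub = [[A[r][c] for c in range(len(A)) if c != j]
           for r in range(len(A)) if r != i]
    return det(sub)

def det(A):
    """Determinant by expansion in minors along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** n * A[0][n] * minor(A, 0, n) for n in range(len(A)))

def inverse(A):
    """Element (n, k) of the inverse is (-1)^(k+n) * minor(A_kn) / Det A."""
    d = det(A)
    if d == 0:
        raise ValueError("Det A is zero; the inverse does not exist")
    size = len(A)
    return [[Fraction((-1) ** (k + n) * minor(A, k, n), d) for k in range(size)]
            for n in range(size)]

A = [[3, 1, 0],
     [-1, 2, 1],
     [0, 2, 2]]
A_inv = inverse(A)
check = [[sum(A[m][n] * A_inv[n][k] for n in range(3)) for k in range(3)]
         for m in range(3)]

print(det(A))                      # 8
print(A_inv[0][0], A_inv[2][1])    # 1/4 -3/4
print(check == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])   # True
```

The entries `A_inv[0][0]` and `A_inv[2][1]` are the elements $A_{11}'$ and $A_{32}'$ computed by hand above.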


The Orthogonal Property of S 

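As indicated previously the interchange of rows and columns of a matrix A forms the transposed matrix $\tilde{A}$.

If

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$

then

$$\tilde{A} = \begin{pmatrix} A_{11} & A_{21} \\ A_{12} & A_{22} \end{pmatrix}$$

Another way of writing this is to state that

$$\tilde{A}_{jm} = A_{mj}$$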
By examining our rotation matrices S and $S^{-1}$ we find that

$$S = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}, \qquad S^{-1} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}$$

with the result that

$$\tilde{S} = S^{-1}$$

The transpose of S is equal to the inverse of S. This is a unique property which is the case for a very special class of matrices.

Such a matrix is called an orthogonal matrix. An orthogonal transformation is one for which the following relation holds,

$$\tilde{S} \cdot S = I$$

remembering that

$$S^{-1} \cdot S = I$$

Also note for future reference that the magnitude (determinant) of the rotation matrix S is 1; for an orthogonal matrix in general the determinant is either +1 or -1.

One important consequence of the orthogonality of the rotation matrix is that it does not change the distance between two points P and Q. A more elegant way of saying this is to state that the distance between two points is invariant under an orthogonal transformation.

To illustrate this let us consider the distance d from the origin O to the point P $(x_1, x_2)$.

$$d^2 = x_1^2 + x_2^2$$

We can manufacture $d^2$ by a matrix multiplication by specifying that when multiplying by a position vector r from the left we must use the transpose of r, which will be a row matrix.
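$$d^2 = \tilde{r} \cdot r = [x_1 \;\; x_2] \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1^2 + x_2^2 = \text{a scalar}$$

To prove the invariance of $d^2$ under rotation we shall show that

$$d^2 = x_1^2 + x_2^2 = x_1'^{2} + x_2'^{2}$$

Remember that the inverse rotation is

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} \quad \text{or} \quad r = S^{-1} \cdot r'$$

and then the transpose matrices are

$$[x_1, x_2] = [x_1', x_2'] \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \quad \text{or} \quad \tilde{r} = \tilde{r}' \cdot \widetilde{S^{-1}} = \tilde{r}' \cdot S$$

Since¹⁰

$$d^2 = [x_1, x_2] \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \tilde{r}' \cdot S \cdot S^{-1} \cdot r' = \tilde{r}' \cdot I \cdot r'$$

substitution provides the relation

$$d^2 = [x_1', x_2'] \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = [x_1', x_2'] \begin{pmatrix} x_1' \\ x_2' \end{pmatrix}$$

and

$$d^2 = x_1'^{2} + x_2'^{2}$$

This completes our proof.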
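Both the orthogonal property and the invariance of $d^2$ can be checked for a particular angle. The sketch below is illustrative only; the helper names `S`, `transpose`, `mat_mul` and `apply` are assumptions, and the point (40, 30) with a rotation of 30° is borrowed from the earlier merry-go-round example.

```python
import math

def S(alpha):
    """Rotation matrix for a rotation of the axes through alpha (radians)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, s], [-s, c]]

def transpose(A):
    """Interchange rows and columns."""
    return [list(column) for column in zip(*A)]

def mat_mul(A, B):
    """Row-by-column product of two matrices."""
    return [[sum(A[j][k] * B[k][m] for k in range(len(B)))
             for m in range(len(B[0]))] for j in range(len(A))]

def apply(A, r):
    """Product of the square matrix A and the column matrix r."""
    return [sum(A[j][k] * r[k] for k in range(len(r))) for j in range(len(A))]

alpha = math.radians(30)
St_S = mat_mul(transpose(S(alpha)), S(alpha))   # should be the unit matrix

r = [40.0, 30.0]
r_prime = apply(S(alpha), r)
d2 = r[0] * r[0] + r[1] * r[1]
d2_prime = r_prime[0] * r_prime[0] + r_prime[1] * r_prime[1]

print(St_S)             # the unit matrix, to within rounding
print(d2, d2_prime)     # both 2500, to within rounding
```

Any other angle or point could be substituted; the product of the transpose of S with S remains the unit matrix and the two values of $d^2$ continue to agree.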

¹⁰ The fact that $S^{-1} = \tilde{S}$ is used in these equations. This relation is an equivalent expression of the orthogonality.
Special Relativity

In the previous development concerning rotations and matrices 
we were able to establish a set of formal rules for dealing with linear 
transformations. As a particular problem the orthogonal two 
dimensional space rotations were investigated in detail. 

The field of relativity in mathematics and physics pertains to 
space-time transformations. From an abstract point of view the 
reader with the preceding work as a background might ask: 

1) “Are the space-time transformations linear?” 

2) “If so, are these transformations orthogonal?” 

This is a standard game in the sciences, and this is a game of extrapolation. In other words, if one field of investigation is found to be described by a compact set of mathematical rules, then the first analysis of an adjoining field might well be attempted with the established rules.

Some criticism can be leveled at an attempt to discuss such an extensive field as that of special relativity. Certainly the laws of transformation of special relativity can be developed from completely physical arguments, and perhaps one can claim a better understanding of the subject after such a development.

Our purpose is to indicate that the rules for special relativistic transformations can be obtained by the simple assumptions that:

The speed of light is a constant in all frames of reference moving at constant relative velocities; and that the transformations from one frame of reference to another are orthogonal.

In the sections to follow space-time transformations will be handled in the same manner as spatial rotations. Let us inspect the elementary aspects of the space-time diagram.

Consider an object O (an expensive vase if you wish) of mass M dropped from the corner of a very tall building at time t = 0. The object falls with a constant acceleration g, and at any later time t it has dropped a distance,

$$x = \frac{1}{2}\,g t^2$$

We can plot the distance of fall x against the time of fall t. The resulting graph is a parabola if the x and t coordinates are drawn mutually perpendicular. There is no reason at this point which demands that x and t be orthogonal. Just as permissible a plot could have been constructed if we had set the x and t axes up with an acute angle between them.

Remember that we used orthogonal axes in configuration space for the convenient reason that the squares of the distance between points could be represented as the sum of the squares of the projections of the intervals between points on the axes.

It will prove convenient to represent the square of the space-time interval between two events as the sum of the squares of the separate space and time intervals.

Notice, we suddenly have introduced the word “event.” An event is a point in the space-time diagram.

In the diagram below a stationary ship at $x_1 = 0$ sends out a light signal at t = 0 along the positive $x_1$ axis. This initiation of a light signal is “an event” at $x_1 = 0$, and t = 0.

[Figure: light signal sent from the ship along the $x_1$ direction.]

The observer at $x_1 = X$ receives the signal at a time t = T. The reception of the signal is “another event” at $x_1 = X$, and t = T.
If we denote the speed of propagation of light as “c” then

$$c = X/T, \quad \text{or} \quad X = cT$$

These two events can be plotted upon a space-time diagram.

Note that the square of the length of the straight line interval between the two events is

$$X^2 + T^2$$

Since the dimensions of the two quantities are different we can write

$$X^2 + c^2 T^2$$

as the square of the length of the interval.

This description of the interval must be further altered. 

We proceed by postulating the Principle of Relativity. 

Consider two coordinate systems O and O', with O' moving relative to O with a constant relative velocity V along the $x_1$ axis.

[Figure: the coordinate systems O and O' with their axes.]
The Principle of Relativity states that it is impossible by a physical experiment to label any one coordinate system as intrinsically stationary or absolutely moving with a constant velocity.

One can only detect the presence of relative motion between the two systems.

Consequence

All physical laws have the same form for systems moving relative to one another with a constant relative velocity. As an example the equations governing the propagation of electromagnetic waves have the same form.

Further Consequence

The speed of propagation of light is the same for all uniformly moving systems, and no effect or signal is propagated with a speed greater than the speed of light.


This principle will seem to lead to a series of physical paradoxes. To explore the results of our postulate let us again consider two frames of reference O and O' moving relative to one another at a constant velocity V directed along the $x_1$ axis.

Let us assume that the time-variable in O will be labeled by t and that the time-variable in O' will be labeled by t'.

At t = 0 and at t' = 0 we can assume that O and O' coincide and that a light pulse is emitted which diverges from both origins in a spherical wave.




[Figure: light pulse emitted.]
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An observer stationaiy in O sees the light pulse at t as a circle of 
radius ct. 

An observer stationary in O' sees the light pulse at t' as a circle 
of radius ct'. 

Because the light pulse is observed as two different circles one of 
radius ct and the other of radius ct' we can conclude that the time 
intervals in O and in O' are not the same. 

Let us plot the path of the light signal along $x_1$ in O and in O'
as a series of events.

If the light pulse is at $x_p$ at time $t_p$ and at $x_q$ at time
$t_q$ then

$$(x_p - x_q) = c\,(t_p - t_q)$$

i.e.,

distance = speed × (time interval);

or squaring,

$$c^2 = \frac{(x_p - x_q)^2}{(t_p - t_q)^2}$$

P' and Q' are the corresponding points in the space of O'. Because
the speed of the light pulse is the same in both O and O' we write

$$c^2 = \frac{(x_p - x_q)^2}{(t_p - t_q)^2} = \frac{(x_p' - x_q')^2}{(t_p' - t_q')^2}$$

In our development of spatial rotations we were actually interested
in transformations which left the length elements invariant. We now
look for the invariant quantity in the transformation between O and
O'. From the preceding equation relating the square of the velocity
of light to the space intervals and time intervals we find that

$$(x_p - x_q)^2 - c^2 (t_p - t_q)^2 = (x_p' - x_q')^2 - c^2 (t_p' - t_q')^2 = 0$$

or

$$(x_p - x_q)^2 + (ict_p - ict_q)^2 = (x_p' - x_q')^2 + (ict_p' - ict_q')^2 = 0$$

where

$$i = \sqrt{-1}$$


If we now redefine the time variable as

$$x_2 = ict$$

and let the space variable x receive the subscript 1, i.e., $x_1$,
the equation relating events in O and O' becomes

$$(x_{1p} - x_{1q})^2 + (x_{2p} - x_{2q})^2 = (x_{1p}' - x_{1q}')^2 + (x_{2p}' - x_{2q}')^2 = 0$$

Now consider any two events M and N. The principle of relativity
suggests that the distance between the two events expressed in the
variables $x_1$ and $x_2$ is invariant under a linear transformation.
Let the interval between any two events M and N be called
$(\tau_m - \tau_n)$, then

$$(\tau_m - \tau_n)^2 = (x_{1m} - x_{1n})^2 + (x_{2m} - x_{2n})^2 = (x_{1m}' - x_{1n}')^2 + (x_{2m}' - x_{2n}')^2$$


By reexamining the problem of the light signal we see immediately
that if two events are connected by a light signal, the interval
between these events is zero. This result arises from the particular
choice of the coordinate $x_2$ as $ict$. The important aspect of this
result is not the magnitude zero but rather the invariance of the
quantity $(\tau_m - \tau_n)^2$ under an orthogonal transformation.

To obtain the Lorentz transformation of special relativity it is
sufficient to require that the transformation between $(x_1, x_2)$
and $(x_1', x_2')$ be orthogonal.

Imagine two observers O and O' moving relative to one another
with a relative velocity V directed along the $x_1$ axis. At $t = 0$
and $t' = 0$ again assume that the origins O and O' coincide. At a
later time t and t' system O' has moved to a point $x_1 = Vt$.



Suppose now that observers O and O' detect an event (an explosion?)
P occurring at $x_p$ and $t_p$ in O and at $x_p'$ and $t_p'$ in O'.

O sees the event at $t_p + x_p/c$
O' sees the event at $t_p' + x_p'/c$

According to our hypothesis of a linear orthogonal transformation
we can relate the two observations by

$$x_1' = S_{11} x_1 + S_{12} x_2$$

and

$$x_2' = S_{21} x_1 + S_{22} x_2$$


To solve these equations for the elements $S_{jk}$ we utilize a
special case in which event P occurs at $x_p' = 0$ (or directly over
the ship). Then, remembering that the position of the ship is at
$x_1 = Vt$,

$$x_{1p}' = 0 = S_{11} V t + S_{12} (ict)$$

or

$$\frac{S_{12}}{S_{11}} = +\frac{iV}{c}$$


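Because we assume that the transformation is orthogonal (this is
a statement regarding the invariance of the interval $\Delta\tau$
under the transformation),

$$S \cdot \tilde{S} = 1$$

or

$$\sum_{k=1}^{2} S_{jk} \tilde{S}_{km} = \delta_{jm}$$

where $\tilde{S}$ denotes the transpose of S, so that
$\tilde{S}_{km} = S_{mk}$. Expanding we find that

$$S_{j1}\tilde{S}_{1m} + S_{j2}\tilde{S}_{2m} = \delta_{jm}$$

Taking specific values of j and m (j = 1, m = 1),

$$S_{11}\tilde{S}_{11} + S_{12}\tilde{S}_{21} = S_{11}S_{11} + S_{12}S_{12} = 1$$

or

$$(S_{11})^2 + (S_{12})^2 = 1$$

Solving for $S_{11}$ from $S_{12} = (iV/c)\,S_{11}$:

$$(S_{11})^2\,(1 - V^2/c^2) = 1$$

or

$$S_{11} = \frac{1}{[1 - V^2/c^2]^{1/2}}$$

and

$$S_{12} = \frac{+iV/c}{[1 - V^2/c^2]^{1/2}}$$

Using the orthogonality condition further
(j = 1, m = 2, and $\delta_{12} = 0$),

$$S_{11}\tilde{S}_{12} + S_{12}\tilde{S}_{22} = S_{11}S_{21} + S_{12}S_{22} = 0$$

Thus

$$\frac{S_{21}}{S_{22}} = -\frac{S_{12}}{S_{11}} = -\frac{iV}{c}$$

Finally (j = 2, m = 2)

$$S_{21}\tilde{S}_{12} + S_{22}\tilde{S}_{22} = (S_{21})^2 + (S_{22})^2 = 1$$

giving

$$S_{22} = \frac{1}{[1 - V^2/c^2]^{1/2}}$$

and

$$S_{21} = \frac{-iV/c}{[1 - V^2/c^2]^{1/2}}$$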


When we substitute these values back into the two linear
algebraic equations representing the transformation from
$(x_1, x_2)$ to $(x_1', x_2')$, we obtain

$$x_1' = \frac{1}{[1 - V^2/c^2]^{1/2}} \left\{ x_1 + i\frac{V}{c}\,x_2 \right\} = [1 - V^2/c^2]^{-1/2}\,(x_1 - Vt)$$

and

$$x_2' = \frac{1}{[1 - V^2/c^2]^{1/2}} \left\{ -i\frac{V}{c}\,x_1 + x_2 \right\}$$

Upon clearing the last of these two equations of the term
$i = \sqrt{-1}$ we obtain

$$t' = \frac{-(V/c^2)\,x_1 + t}{\sqrt{1 - V^2/c^2}}$$

This is the Lorentz transformation of special relativity. Our
purpose here has not been to discuss the physical basis for these
equations as much as it has been to show the manner in which a
method developed for the two-dimensional Euclidean space could
be applied to a new problem, in this case space-time.
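The derivation above can be checked numerically. The following is a minimal sketch, not part of the original text: it builds the matrix S from the elements found above, confirms that S times its transpose is the unit matrix, and applies S to an event with $x_2 = ict$. The function names and the choice of units with c = 1 are assumptions made only for illustration.

```python
import math

def lorentz_matrix(V, c=1.0):
    """Matrix S acting on (x1, x2) with x2 = i*c*t, using the elements derived in the text."""
    beta = V / c
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[gamma, 1j * beta * gamma],
            [-1j * beta * gamma, gamma]]

def times_transpose(S):
    """Return S multiplied by its transpose (plain transpose, no complex conjugation)."""
    return [[sum(S[j][k] * S[m][k] for k in range(2)) for m in range(2)]
            for j in range(2)]

def transform_event(x1, t, V, c=1.0):
    """Map an event (x1, t) observed in O to (x1', t') observed in O'."""
    S = lorentz_matrix(V, c)
    x2 = 1j * c * t                      # the imaginary time coordinate
    x1_prime = S[0][0] * x1 + S[0][1] * x2
    x2_prime = S[1][0] * x1 + S[1][1] * x2
    t_prime = x2_prime / (1j * c)        # clear the factor i*c to recover t'
    return x1_prime.real, t_prime.real

if __name__ == "__main__":
    print(times_transpose(lorentz_matrix(0.6)))
    print(transform_event(1.0, 2.0, V=0.6))
```

The quantity $x_1^2 - c^2 t^2$ computed before and after the transformation should agree, which is the invariance of the interval in numerical form.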

In a manner of speaking we have played an intriguing game. We
have asked the question: “Can the matrix method of spatial
rotations be applied to space-time transformations?” The answer
fortunately is yes.

Our method is somewhat different from the customary approach.
Ordinarily the Lorentz transformation is obtained by a physical
argument.

The Lorentz transformation of the special theory of Einstein was
first developed by Lorentz in a manner similar to that used here.
Einstein’s contribution was to provide a physical description of the
equations. We cannot neglect, however, the insight of Lorentz in
first noting the transformation.

Once we have obtained the Lorentz transformations for x and t,
we can predict some of the physical consequences of special
relativity.

The time intervals (in two frames moving relative to one another
with a constant velocity) are not the same. To illustrate this let us
consider the two frames O and O' shown below.

The clock on O measures the time t. The clock on O' measures
the time t'. Assuming that both clocks were synchronized to read
zero time when the origins O and O' coincided, let us investigate
two events $P_1$ and $P_2$ observed by both O and O' (consider that
$P_2$ happens later than $P_1$ but at the same point in O, i.e., at
fixed $x_1$).

The observer at O records event $P_1$ at the time $t_1$ and event
$P_2$ at the time $t_2$.

The observer on O' records event $P_1$ at the time $t_1'$ and event
$P_2$ at the time $t_2'$.

The observer at O finds that the time interval between $P_1$ and
$P_2$ is given by

$$\Delta t = \text{change in time} = t_2 - t_1$$

The observer at O' considers that the time interval is

$$\Delta t' = t_2' - t_1'$$

When we utilize the fact that the point at which $P_1$ and $P_2$
occur in O is fixed in O, then using the relation between t', $x_1$
and t, we find that

$$\Delta t' = t_2' - t_1' = \frac{t_2 - t_1}{\sqrt{1 - V^2/c^2}} = \frac{\Delta t}{\sqrt{1 - V^2/c^2}}$$

This is called time dilation, i.e., $\Delta t'$ is greater than
$\Delta t$.

If the two observers in O and O' compare time intervals, the
observer in O' concludes that the clock in O is running slow. In
other words when the observer in O claims one hour between
events the observer in O' finds that more than one hour elapses by
his clock. Remember this comparison is conditioned by the fact
that the events occur at a fixed point in O. If on the other hand
the events occurred at a fixed point in O' the claims as to slowness
of the clocks are reversed.

Time dilations are observed in the laboratory. One very
spectacular example is that of the decay of the $\mu$ meson.

The mu minus (symbol $\mu^-$) meson has a mass approximately
200 times the electron mass and has a charge equal to that of the
electron. The $\mu^-$ meson is unstable in that it decays after a
time interval $\Delta t$ into an electron ($e^-$) and two neutrinos.
The neutrino is a particle of zero mass and is denoted by the
symbol $\nu$.

Symbolically the decay is represented as shown below:

$$\mu^- \rightarrow e^- + \nu + \nu$$

The lifetime $\Delta t$ of the $\mu^-$ meson at rest can be
considered as fixed in the frame of rest of the $\mu^-$ meson.

The position of the events is therefore fixed in the frame of
reference of the $\mu^-$ meson; $x_1(t_2) = x_1(t_1)$.

The half-life can be measured by measuring the number of
electrons emitted from a beam of mu mesons in a specified time
interval, $\Delta t = t_2 - t_1$.

Time dilation is exhibited by the observed fact that the half-life
of the mu meson in flight (i.e. moving) is longer than the half-life of
the stationary mu meson (i.e. at rest in the laboratory frame).

Consider a segment of a $\mu^-$ meson beam as the frame O. In a
time $\Delta t$, N electrons are emitted. In the laboratory frame O'
(moving backward) these N electrons are counted in a time
$\Delta t'$ which is greater than or equal to $\Delta t$ depending
upon the relative velocity V.

Experimentally one finds that the stationary (relative velocity
zero) mu meson has a shorter half-life than the moving mu meson. If
the relative velocity of the mu meson with respect to the laboratory
is V then the lifetime $\Delta t$ of the stationary mu is
$(1 - V^2/c^2)^{1/2}$ times the lifetime $\Delta t'$ of the moving mu
as observed in the laboratory.

$$\Delta t' = \frac{\Delta t}{\sqrt{1 - V^2/c^2}}$$
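As a rough numerical illustration of the time dilation formula, the sketch below evaluates $\Delta t' = \Delta t/\sqrt{1 - V^2/c^2}$ for several relative speeds. The function names are hypothetical, and the rest lifetime used in the example is an assumed illustrative figure that does not come from the text.

```python
import math

def dilation_factor(V, c=299_792_458.0):
    """Return 1 / sqrt(1 - V^2/c^2) for a relative speed V given in the same units as c."""
    if abs(V) >= c:
        raise ValueError("relative speed must be below c")
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)

def dilated_interval(dt, V, c=299_792_458.0):
    """Interval dt' found in O' for two events a time dt apart at a fixed point of O."""
    return dt * dilation_factor(V, c)

if __name__ == "__main__":
    c = 299_792_458.0
    rest_lifetime = 2.2e-6  # seconds; assumed illustrative value, not taken from the text
    for fraction in (0.0, 0.6, 0.9, 0.99):
        print(fraction, dilated_interval(rest_lifetime, fraction * c))
```

At zero relative velocity the two intervals coincide, and the laboratory interval grows without bound as V approaches c.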
CHAPTER 2


Vector 



A. Introduction 


MANY OF THE QUANTITIES with which we 
are adequately described by the specification 
only. The magnitude of a quantity is a 
classified as a scalar quantity. For example such 
as mass, volume, temperature are scalar quantities. 

Often physical quantities require the specification of three
numbers to describe them adequately. These three numbers as we will
see shortly can refer to a directed line segment (in configuration
space). When the distance d between two points P and Q in the
space of $E_3$ was discussed, care was exercised to treat only the
square of d. This procedure obviously enabled us to treat the scalar
distance between the two points without reference to a preferred
direction of the line segment.

Quantities such as position relative to a specified origin,
velocity, acceleration, and force are all directed quantities and
as such are vectors.

B. Definition of the Vector

Let P be a fixed point in the Euclidean space $E_3$ (or $E_2$). Then
by a vector A at P we mean simply a straight line segment drawn
from P to some other point Q with the direction indicated toward
Q. We sometimes indicate this vector by $\overrightarrow{PQ}$ instead
of A.

If Q lies upon P, or P = Q, then the vector is a null vector or
zero vector.


Since the same line segment could equally form a vector at Q
pointing in the direction of P, it is necessary to specify the direction
of the line segment. In drawing diagrams we usually attach an
arrowhead to specify the direction. For example as before the
vector $\mathbf{A} = \overrightarrow{PQ}$ at P is drawn as



The distance from P to Q is called the length or magnitude of the
vector A and is written

$$|\mathbf{A}| \quad \text{or} \quad |\overrightarrow{PQ}|$$

The two vertical lines bracketing a vector indicate that we are
considering the magnitude of the vector. Because the magnitude of
the vector $\overrightarrow{PQ}$ is equal to the distance between P
and Q the magnitude is a SCALAR quantity. A scalar quantity is a
more general class of objects being either positive or negative. A
MAGNITUDE on the other hand has the limitation that it is always
taken positive.


C. Properties of Vectors 

In this section some of the general properties of vectors will be
specified.

Two parallel line segments of equal lengths and having the same
direction are said to be equal vectors.

$\mathbf{A} = \mathbf{B}$ in the diagram.

Two parallel line segments of equal lengths but pointing in
opposite directions have the property that one is the negative of
the other

$$\mathbf{A} = -\mathbf{C}$$

Let A be a vector at the point P. In general a vector can be
multiplied by a number n (a scalar). This operation produces a
vector which has the same direction (or opposite if n is a negative
number) as the original vector A, but has a magnitude |n| times as
great as the magnitude of the original vector.

The product of the vector A with the scalar n is represented by

$$n\mathbf{A}$$

nA is parallel to A if n > 0

nA is antiparallel to A if n < 0

As an illustration consider A, 2A, and 



From the definition

$$|n\mathbf{A}| = |n|\,|\mathbf{A}|$$

stipulating that the magnitude of a vector is always taken as a
positive number. The magnitude sign about the scalar “n” removes
any negative signs.

If m is any other number, we can form the product of m and nA
in any combination that we desire

$$m(n\mathbf{A}) = (mn)\mathbf{A} = mn\mathbf{A}$$


The Unit Vector


If in forming nA we take n to be

$$n = \frac{1}{|\mathbf{A}|}$$

assuming that $\mathbf{A} \neq 0$, then the length of the resulting
vector after multiplication is one.

$$\frac{1}{|\mathbf{A}|}\,\mathbf{A} = \frac{\mathbf{A}}{|\mathbf{A}|}$$

This vector has the same direction as A; however

$$\left|\frac{\mathbf{A}}{|\mathbf{A}|}\right| = \frac{|\mathbf{A}|}{|\mathbf{A}|} = 1$$

The unit vector is dimensionless.
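The rules for multiplying a vector by a scalar and for forming the unit vector can be tried out on components. This is a minimal sketch that assumes a vector is stored as a tuple of Cartesian components, a representation the text has not yet introduced at this point; the function names are hypothetical.

```python
import math

def scale(n, A):
    """Multiply the vector A (a tuple of components) by the scalar n."""
    return tuple(n * a for a in A)

def magnitude(A):
    """Length of A: the positive square root of the sum of squared components."""
    return math.sqrt(sum(a * a for a in A))

def unit(A):
    """Return A / |A|; undefined for the zero vector."""
    length = magnitude(A)
    if length == 0.0:
        raise ValueError("the zero vector has no direction")
    return scale(1.0 / length, A)

if __name__ == "__main__":
    A = (3.0, 4.0, 0.0)
    print(magnitude(scale(-2.0, A)))   # equals |-2| * |A|
    print(unit(A), magnitude(unit(A)))
```

Scaling by a negative number reverses the direction but the magnitude is multiplied by |n|, in agreement with the relation $|n\mathbf{A}| = |n|\,|\mathbf{A}|$.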


Addition of Vectors 

The operations to follow will involve two or more vectors,
instead of a vector and a number. In these operations involving
more than one vector the vectors must all be located at the same
point of $E_3$ (or $E_2$).

For example it does not make sense to add a vector at P to a
vector at a different point P'.

Let A and B be two vectors located at a point P. We are going
to define a third vector at P called the sum of A and B, and
indicated by A + B.

To do this construct the parallelogram having A and B as its two
sides and let Q be the vertex of the figure opposite from P. Then
$\overrightarrow{PQ}$ is by definition the vector A + B. This is
illustrated in the diagram below.



Of course if A and B happen to lie in the same or opposite
directions, then it is not possible to construct the parallelogram.
In such a case A + B is defined in the most obvious way: If A and B
are parallel, then A + B is defined by the vector of length
$|\mathbf{A}| + |\mathbf{B}|$ in the common direction of A and B.
Note in the diagram below how this definition is consistent with the
parallelogram method as the parallelogram collapses into a line.



If A and B are antiparallel, and if say $|\mathbf{A}| > |\mathbf{B}|$
then A + B is the vector of length $|\mathbf{A}| - |\mathbf{B}|$ in
the direction of A. Again this definition is consistent with the
parallelogram method.



1. Vector Addition obeys the commutative law, i.e., the order in
which the vectors are added does not alter the final result.

$$\mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}$$

2. Also Vector Addition obeys the associative law, i.e., the grouping
in the addition of three or more vectors does not alter the final result:

$$\mathbf{R} = (\mathbf{A} + \mathbf{B}) + \mathbf{C} = (\mathbf{C} + \mathbf{A}) + \mathbf{B} = \mathbf{A} + (\mathbf{B} + \mathbf{C})$$

3. Vector Addition obeys the distributive law, i.e., a scalar times the
sum of two vectors is equal to the sum of the scalar times each vector.
Let n be a scalar, then

$$n(\mathbf{A} + \mathbf{B}) = n\mathbf{A} + n\mathbf{B}$$

If A is different from zero and

$$n\mathbf{A} = 0$$

then n must be zero.

The length of the resultant A + B can be obtained from the law
of cosines.

$$|\mathbf{A} + \mathbf{B}| = \{\, |\mathbf{A}|^2 + |\mathbf{B}|^2 - 2\,|\mathbf{A}|\,|\mathbf{B}| \cos(\pi - \theta) \,\}^{1/2}$$

where $\theta$ is the angle between A and B.
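The law of cosines for the resultant can be compared with a direct component sum. In this sketch, which is only an illustration, A is placed along one axis of the plane and B at an angle theta to it; the component representation and the function names are assumptions.

```python
import math

def resultant_length_components(a, b, theta):
    """|A + B| from components, with A = (a, 0) and B at angle theta to A."""
    A = (a, 0.0)
    B = (b * math.cos(theta), b * math.sin(theta))
    return math.hypot(A[0] + B[0], A[1] + B[1])

def resultant_length_cosines(a, b, theta):
    """|A + B| from the law of cosines as written in the text."""
    return math.sqrt(a * a + b * b - 2.0 * a * b * math.cos(math.pi - theta))

if __name__ == "__main__":
    for theta in (0.0, 0.7, math.pi / 2, math.pi):
        print(theta,
              resultant_length_components(3.0, 4.0, theta),
              resultant_length_cosines(3.0, 4.0, theta))
```

The cases theta = 0 and theta = pi reproduce the parallel and antiparallel rules, where the parallelogram collapses into a line.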

The difference of two vectors is obtained by the rules of addition.

Consider the vectors A and B, and compute
$\mathbf{A} - \mathbf{B}$. The difference can be written
$\mathbf{A} + (-1)\mathbf{B}$. The multiplication of B by (-1)
produces a new vector antiparallel to B but of equal magnitude.

The relative velocity problem can be utilized as an example of the
addition of vectors. The velocity of a moving body is a vector at any
instant of time because the velocity has both direction and magnitude
(speed).


Velocity is always defined relative to a given coordinate system. For
instance we can stipulate that a ship moves southeast at say 20 miles
per hour. The coordinates are the spherical grid laid out on the
earth’s surface.


For convenience let us take the direction south as the x axis¹ and
the direction east as the y axis.¹ Then $\mathbf{V}_s$ can be
represented graphically as shown to the right.

Since velocity implies the motion of a specified object (a ship in
this case) relative to a specified frame (the earth’s surface in our
example) we can label the vector representing the velocity
$\mathbf{V}_s$ in such a manner that this information is seen
graphically. We place the name of the moving body at the head of the
vector, and we label the tail of the vector with the frame of
reference.

In our problem involving the ship we label the vector as shown.

Now suppose we are given the velocity of the wind relative to the
ship and we state that the wind blows to the Northeast at a relative
velocity of 10 miles per hour (denote this velocity as
$\mathbf{v}_w$).

¹ This convention is chosen to correspond to the system of base
vectors in spherical coordinates.

In this case the ship is the frame of reference. Thus the vector
giving the relative wind velocity is drawn below.

Up to this point we have defined symbols. What problem can be
solved utilizing these symbols? One problem is the determination of
the velocity of the wind relative to the earth.

The solution is simple when we add the vectors in such a manner
that the labels coincide.

We see that the velocity of the wind relative to the earth
$\mathbf{V}_w$ is given by the equation,

$$\mathbf{V}_s + \mathbf{v}_w = \mathbf{V}_w$$

or as it is usually presented

$$\mathbf{v}_w = \mathbf{V}_w - \mathbf{V}_s$$

This approach can now be extended to a more complicated problem.
Consider our ship in the problem as ship 1 and now assume that we
have a second ship moving due East at a speed of 25 miles per hour.

We can ask the question: “If the wind velocity relative to ship 1 is
given ($\mathbf{v}_{w1}$), what is the velocity of the wind relative
to ship 2, $\mathbf{v}_{w2}$?” The solution is obtained from the
vector diagram.

[Figure: vector diagram with labels EARTH and SHIP 2.]
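The ship and wind problem can be worked in components. The sketch below follows the convention of the text, with the first component pointing south and the second pointing east, and uses the speeds quoted in the example (20, 10, and 25 miles per hour). The variable and function names are hypothetical.

```python
import math

H = math.sqrt(0.5)           # cos 45 degrees = sin 45 degrees
SOUTHEAST = (H, H)           # unit vectors as (south, east) components
NORTHEAST = (-H, H)
EAST = (0.0, 1.0)

def scale(n, A):
    return (n * A[0], n * A[1])

def add(A, B):
    return (A[0] + B[0], A[1] + B[1])

def sub(A, B):
    return (A[0] - B[0], A[1] - B[1])

V_s1 = scale(20.0, SOUTHEAST)   # ship 1 relative to the earth
v_w1 = scale(10.0, NORTHEAST)   # wind relative to ship 1
V_w = add(V_s1, v_w1)           # wind relative to the earth: V_s + v_w = V_w

V_s2 = scale(25.0, EAST)        # ship 2 relative to the earth
v_w2 = sub(V_w, V_s2)           # wind relative to ship 2: v_w = V_w - V_s

if __name__ == "__main__":
    print("wind relative to earth (south, east):", V_w)
    print("wind relative to ship 2 (south, east):", v_w2)
```

The same wind velocity relative to the earth serves both ships; only the subtraction of each ship's own velocity differs.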
D. Multiplication of Vectors

The projection of one vector upon another

Let A and B be two vectors at P. We assume that
$\mathbf{A} \neq 0$.

The projection of B upon A is defined to be the vector aA, where

$$a = \frac{|\mathbf{B}|}{|\mathbf{A}|} \cos\theta$$

$\theta$ being the angle between A and B.

The projection aA has the same direction as A if a > 0 (i.e., if
$\cos\theta > 0$, and $\theta < 90^\circ$). The projection has the
opposite direction if a < 0 (i.e., $\theta < \pi$ and
$\theta > 90^\circ$). The two cases are illustrated below.



The Scalar or Inner Product 

The Scalar Product of two vectors A and B can be defined in
terms of the magnitude of the projection of one vector on the other.

The scalar product between A and B is denoted by placing a dot
between the two and is defined as

$$\mathbf{A} \cdot \mathbf{B} = |\mathbf{A}|\,|\mathbf{B}| \cos\theta$$

$\theta$ is the smallest angle between A and B.

The dot or scalar product gives a scalar (not a vector).

This multiplication is convenient for defining the magnitude of a
vector or a sum of vectors. We find that

$$|\mathbf{A}| = \sqrt{\mathbf{A} \cdot \mathbf{A}}$$

the positive square root of the scalar product of a vector with itself
is the magnitude of the vector.

Using this relation we can show that

$$|\mathbf{A} + \mathbf{B}| = \sqrt{(\mathbf{A} + \mathbf{B}) \cdot (\mathbf{A} + \mathbf{B})} = \sqrt{|\mathbf{A}|^2 + |\mathbf{B}|^2 + 2\,\mathbf{A} \cdot \mathbf{B}}$$

The expression above is a restatement of the law of cosines.

$$|\mathbf{A} + \mathbf{B}|^2 = |\mathbf{A}|^2 + |\mathbf{B}|^2 + 2\,|\mathbf{A}|\,|\mathbf{B}| \cos\theta$$

The order of multiplication of the vectors in the scalar product does
not affect the result. The scalar product of two vectors commutes.

$$\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{A}$$

If “a” is any number, then

$$(a\mathbf{A}) \cdot \mathbf{B} = \mathbf{A} \cdot (a\mathbf{B}) = a(\mathbf{A} \cdot \mathbf{B})$$

If C is another vector at P, then

$$\mathbf{A} \cdot (\mathbf{B} + \mathbf{C}) = \mathbf{A} \cdot \mathbf{B} + \mathbf{A} \cdot \mathbf{C}$$

It follows easily from the definition that the projection of B on A is
(see the previous section)

$$a\mathbf{A} = \frac{(\mathbf{A} \cdot \mathbf{B})}{|\mathbf{A}|^2}\,\mathbf{A}$$
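The scalar product and the projection formula can be illustrated with components. The sketch below assumes the familiar component form of the scalar product (the sum of products of corresponding Cartesian components), which the text has not yet derived at this point; the function names are hypothetical.

```python
import math

def dot(A, B):
    """Scalar product from Cartesian components (assumed component form)."""
    return sum(a * b for a, b in zip(A, B))

def magnitude(A):
    """|A| as the positive square root of A . A."""
    return math.sqrt(dot(A, A))

def projection(B, A):
    """Projection of B upon A: the vector aA with a = (A . B) / |A|^2."""
    if dot(A, A) == 0.0:
        raise ValueError("A must not be the zero vector")
    a = dot(A, B) / dot(A, A)
    return tuple(a * component for component in A)

if __name__ == "__main__":
    A = (2.0, 0.0, 0.0)
    B = (3.0, 4.0, 0.0)
    print(dot(A, B), dot(B, A))     # the scalar product commutes
    print(projection(B, A))         # the part of B lying along A
    total = tuple(a + b for a, b in zip(A, B))
    print(magnitude(total) ** 2,
          magnitude(A) ** 2 + magnitude(B) ** 2 + 2.0 * dot(A, B))
```

The last two printed numbers agree, which is the law of cosines written with the scalar product.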


The work done by a force F acting through a distance is defined
in terms of the scalar product between the vector F and the straight
line distance designated by the vector l

$$W_l = \mathbf{F} \cdot \mathbf{l}$$

If we consider a body moving between points P and Q in a
straight line and subject to a constant force F, then

$$W_l = \mathbf{F} \cdot \mathbf{l}$$

In practice we need to know the total work done on a body
moving along a curve (not in general a straight line) and subject to
a force which varies in direction and magnitude from point to point
on the curve. A typical case is illustrated below.

In this illustration we break the curve into a finite number of
segments and draw the force at each point of division.

We can obtain an approximate value for the work by taking the
sum of the scalar products of the force at the beginning of an
interval with a vector chord which approximates the interval.

Consider the 6 → 7 interval

$$W_{6 \to 7} = \mathbf{F}_6 \cdot \mathbf{l}_{6,7}$$

The approximation of the total work is obtained by summing over
all of the intervals.

$$W_{\text{Total}} \approx \sum_{j=1}^{9} \mathbf{F}_j \cdot \mathbf{l}_{j,j+1}$$

The approximation becomes better as the intervals are taken
smaller in length and as the total number of intervals is increased.
The proof of this statement is contained in the fact that the chord of
a small segment approximates its corresponding arc to a higher
degree than does the chord of a larger segment.

Later in the discussion of the integral calculus we shall show that
the total work is obtained exactly if we let the length of the
intervals approach zero while the number of intervals increases in an
appropriate manner.

$$W_{\text{Total}} = \lim_{\substack{l \to 0 \\ N \to \infty}} \sum_{j=1}^{N} \mathbf{F}_j \cdot \mathbf{l}_{j,j+1}$$
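The chord approximation to the work can be tried on a simple case. The sketch below is only an illustration: the force field F(x, y) = (-y, x) and the quarter-circle path of unit radius are hypothetical choices, picked because the exact work along that path is pi/2, so the improvement with the number of intervals is easy to see.

```python
import math

def force(point):
    """Hypothetical force field F(x, y) = (-y, x), chosen only for illustration."""
    x, y = point
    return (-y, x)

def path_point(phi):
    """Point at angle phi on a circle of unit radius (the assumed curve)."""
    return (math.cos(phi), math.sin(phi))

def approximate_work(n_intervals):
    """Sum of F_j . l_(j,j+1) over chords of a quarter circle, force taken at the start of each chord."""
    total = 0.0
    for j in range(n_intervals):
        start = path_point(0.5 * math.pi * j / n_intervals)
        end = path_point(0.5 * math.pi * (j + 1) / n_intervals)
        chord = (end[0] - start[0], end[1] - start[1])
        F = force(start)
        total += F[0] * chord[0] + F[1] * chord[1]
    return total

if __name__ == "__main__":
    exact = 0.5 * math.pi
    for n in (9, 90, 900):
        print(n, approximate_work(n), abs(approximate_work(n) - exact))
```

The error shrinks as the chords are made shorter and more numerous, which is the behavior described in the text.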



The Cross Product or Vector Product 

All of the vector operations which have been discussed up to this 
point can be performed in either E 2 or E 3 . The operation which 
carries the designation “vector product” applies only to vectors in E 3 . 

We shall associate with any two vectors A and B (at a point P of
$E_3$) a third vector A × B at P. This vector, A × B, is called the
cross (or vector) product of A and B. This operation is somewhat
more complicated than the operations defined previously in that the
concept of a right-handed system must be invoked.

A vector A and a vector B form a plane.

The angle $\theta$ between A and B is measured in the plane of A
and B.

Suppose for the moment that $\theta \neq 180^\circ$ or $0^\circ$.

The perpendicular to the plane of A and B is a line characteristic
of the combination A and B.

We define the cross product in terms of the properties of the
combination A and B which are listed above.

The magnitude of the vector product of A and B is defined as
the magnitude of A times the magnitude of B times the sine of the
(smallest) angle between them,

$$|\mathbf{A} \times \mathbf{B}| = |\mathbf{A}|\,|\mathbf{B}| \sin\theta$$

The direction of the vector product is taken perpendicular to
the plane defined by A and B. We have two choices for the
direction and the choices are of opposite sense.

The direction is defined by a right hand convention. Start with
the fingers of the right hand pointing in the direction of the first
vector A. The fingers are then rotated toward the second vector B
by closing the fist. The thumb then indicates the direction of the
cross product.

$$\mathbf{C} = \mathbf{A} \times \mathbf{B}$$

$$|\mathbf{C}| = |\mathbf{A}|\,|\mathbf{B}| \sin\theta$$

If we utilize our previous definition of a right-handed coordinate
system the direction of the vector A × B is determined. We align the
first vector (A in the cross product) with the x axis. The plane of A
and B is oriented in the xy plane. In such a case the vector A × B is
determined by measuring a distance
$|\mathbf{A}|\,|\mathbf{B}| \sin\theta$ along the positive z axis.

Having defined the direction of A × B we now find that the order
of the multiplication is important. In fact, using our right hand
rule we see that an inversion in the order changes the sign but not
the magnitude of the cross product.

$$\mathbf{B} \times \mathbf{A} = -\mathbf{A} \times \mathbf{B}$$

and

$$|\mathbf{C}| = |\mathbf{B} \times \mathbf{A}| = |\mathbf{A} \times \mathbf{B}| = |\mathbf{A}|\,|\mathbf{B}| \sin\theta$$
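The defining properties of the cross product can be checked on a case arranged as in the text, with A along the x axis and B in the xy plane. The sketch uses the standard component formula for the cross product in a right-handed Cartesian system, which is an assumption here since the text defines the product geometrically; the function name is hypothetical.

```python
import math

def cross(A, B):
    """Cross product from components in a right-handed Cartesian system (assumed formula)."""
    return (A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0])

if __name__ == "__main__":
    A = (2.0, 0.0, 0.0)              # aligned with the x axis
    B = (3.0, 4.0, 0.0)              # lies in the xy plane
    theta = math.atan2(B[1], B[0])   # angle from A to B
    print(cross(A, B))               # points along the positive z axis
    print(cross(B, A))               # reversed order reverses the sign
    print(2.0 * 5.0 * math.sin(theta))
```

The z component of A × B equals |A| |B| sin(theta), and interchanging the factors changes only the sign.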

The cross product (or vector product) has many applications in
science. In order to illustrate the great utility of the cross
product a discussion of mechanical angular momentum will now be
presented. This will require some digression from the mathematics
and in addition will require the introduction of a few new concepts.

The Angular Momentum of a mass point.

In the figure to the right an idealized system is shown. The entire
system consists of a set of coordinates with an origin O and a mass
point m moving with a vector velocity v. The mass m is located at
any given instant in time by a position vector r.

The vector velocity v is related to the position vector r. To see
this consider the position of m at two different times $t_1$ and
$t_2$ with $t_2$ being the later time. At $t_1$ the position of m is
given by $\mathbf{r}_1$. At $t_2$ the position of m is given by
$\mathbf{r}_2$.





























Thus in the time interval $(t_2 - t_1)$ the particle has moved the
vector distance given by $\mathbf{r}_2 - \mathbf{r}_1$.


If the time interval $(t_2 - t_1)$ is sufficiently small so that
$(\mathbf{r}_2 - \mathbf{r}_1)$ represents the path taken by the
point mass m then the average vector velocity is defined by the ratio

$$\mathbf{v} = \frac{(\mathbf{r}_2 - \mathbf{r}_1)}{(t_2 - t_1)}$$

Several things should be noted about this result.

1. The average velocity defined in this manner is that velocity
associated with a time intermediate between $t_2$ and $t_1$, say at
a time

$$\tfrac{1}{2}(t_2 + t_1)$$

2. In general the time interval $(t_2 - t_1)$ should be very short.
To see this consider a point rotating at a constant angular velocity
on the circumference of a circle.

If $(\mathbf{r}_2 - \mathbf{r}_1)$, the chord of the arc subtended
between $t_2$ and $t_1$, is to approximate the path between $t_2$
and $t_1$, the chord $(\mathbf{r}_2 - \mathbf{r}_1)$ must be
approximately equal to the arc subtended between 1 and 2.

This will only be the case when the angle between $\mathbf{r}_2$
and $\mathbf{r}_1$ is very small.

It should then be apparent that by taking $t_2$ very close to $t_1$
the conditions of approximation can be satisfied.

The vector linear momentum “p” of a mass point is by definition
the scalar mass m times the vector velocity v;

$$\mathbf{p} = m\mathbf{v}$$

The vector angular momentum L relative to O is defined as the
cross product of the position vector r and the linear momentum p;

$$\mathbf{L} = \mathbf{r} \times \mathbf{p} = \mathbf{r} \times (m\mathbf{v})$$

Notice in particular that L is defined at a specific time t and is
always defined relative to a specified origin O.

If the position r and the velocity v vary in direction and/or
magnitude then the angular momentum L will vary in direction and
magnitude.

In order that this be more apparent let us again take the example
of the point mass moving on the circumference of a circle. Also
assume that the magnitude of the velocity of the point is constant.

This means that the velocity on the circumference (called the
tangential velocity) changes only its direction in time. Therefore
the magnitude of the velocity at $t_1$ is the same as the magnitude
of the velocity at $t_2$. However the directions of the velocity
vectors at $t_1$ and $t_2$ are different.

Using the diagrams to the right we can define an additional
kinematic vector known as the angular velocity
$\boldsymbol{\omega}$.

The angular velocity has a magnitude equal to the change in angle
divided by the change in time

$$|\boldsymbol{\omega}| = \frac{(\phi_2 - \phi_1)}{(t_2 - t_1)}$$































The direction of the angular velocity vector is perpendicular to
the plane of $\mathbf{r}_2$ and $\mathbf{r}_1$, and for circular
motion $\boldsymbol{\omega}$ is perpendicular to the plane of the
circle. The direction is the same as the direction of the vector
$\mathbf{r}_1 \times \mathbf{r}_2$.

For the circular problem we find that when $\Delta\phi$ is small
the length of the chord is approximately equal to the length of the
arc subtended by $\mathbf{r}_1$ and $\mathbf{r}_2$. Therefore, if
$(\phi_2 - \phi_1)$ is small

$$|\mathbf{r}_2 - \mathbf{r}_1| \approx (\phi_2 - \phi_1) \cdot |\mathbf{r}_1|$$

Since $(\phi_2 - \phi_1)$ has been defined in terms of
$\boldsymbol{\omega}$ we can write

$$|\mathbf{r}_2 - \mathbf{r}_1| \approx |\boldsymbol{\omega}|\,(t_2 - t_1) \cdot |\mathbf{r}_1|$$

[Figure: circular orbit.]

The tangential velocity has been defined, and the magnitude is
obtained by dividing by $(t_2 - t_1)$.

$$|\mathbf{v}_t| = \frac{|\mathbf{r}_2 - \mathbf{r}_1|}{t_2 - t_1} = |\boldsymbol{\omega}| \cdot |\mathbf{r}|$$

and in general

$$\mathbf{v}_t = \boldsymbol{\omega} \times \mathbf{r}$$

As an exercise the reader should show that in the case of circular
motion the direction of $\boldsymbol{\omega}$ is uniquely defined in
this equation if $\mathbf{v}_t$ and r are given. Hint: cross r into
$\mathbf{v}_t$ and use the characteristic of circular motion that r
and $\mathbf{v}_t$ are perpendicular, then
$\boldsymbol{\omega} = \mathbf{r} \times \mathbf{v}_t / |\mathbf{r}|^2$.

In general

$$\mathbf{L} = \mathbf{r} \times (m\mathbf{v})$$

For circular motion

$$\mathbf{v} = \mathbf{v}_t = \boldsymbol{\omega} \times \mathbf{r}$$

Thus

$$\mathbf{L} = \mathbf{r} \times (m\boldsymbol{\omega} \times \mathbf{r})$$

Because $\boldsymbol{\omega}$ and r are mutually perpendicular

$$\mathbf{L} = m r^2 \boldsymbol{\omega}$$

in other words L and $\boldsymbol{\omega}$ are parallel vectors.

It is interesting to note that this result is true for point masses
only. When the problem relates to a distributed mass such as a
rigid body (a top, gyroscope, etc.) the instantaneous angular
momentum L and the instantaneous angular velocity
$\boldsymbol{\omega}$ are not necessarily parallel.
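The result for circular motion can be confirmed numerically. In the sketch below the circle is placed in the xy plane with the angular velocity along the z axis; that orientation, the component formula for the cross product, and the numerical values are assumptions made only for illustration.

```python
import math

def cross(A, B):
    """Cross product from components in a right-handed Cartesian system (assumed formula)."""
    return (A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0])

def scale(n, A):
    return tuple(n * a for a in A)

def angular_momentum(m, r, omega):
    """L = r x (m v) with the tangential velocity v = omega x r."""
    v_t = cross(omega, r)
    return cross(r, scale(m, v_t))

if __name__ == "__main__":
    m, R, w, phi = 2.0, 3.0, 0.5, 0.4      # illustrative values only
    r = (R * math.cos(phi), R * math.sin(phi), 0.0)
    omega = (0.0, 0.0, w)
    print(angular_momentum(m, r, omega))   # compare with m * R**2 * omega
    v_t = cross(omega, r)
    print(scale(1.0 / R ** 2, cross(r, v_t)))  # recovers omega, as in the exercise
```

For any position on the circle the computed L has only a z component, of size m R² w, so L is parallel to the angular velocity.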



The Triple Scalar Product

Multiple products of more than two vectors can be created by
various combinations of the elementary operations of the scalar
product and the vector product.

One of the well known triple products is that known as the triple
scalar product. This particular form arises when a vector product of
two vectors is combined with a third vector by means of a scalar
product.

Consider the vector product between two vectors B and C,

$$\mathbf{B} \times \mathbf{C}$$

The vector B × C is perpendicular to the plane of B and C. Now
take the scalar product of (B × C) with a third vector A,

$$\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C})$$


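The triple scalar product has several interesting properties. First
the order can be cycled without changing the sign.

$$\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C}) = \mathbf{C} \cdot (\mathbf{A} \times \mathbf{B}) = \mathbf{B} \cdot (\mathbf{C} \times \mathbf{A})$$

If the order of cycling is changed the product changes sign

$$\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C}) = -\mathbf{A} \cdot (\mathbf{C} \times \mathbf{B}) = -\mathbf{B} \cdot (\mathbf{A} \times \mathbf{C}) = -\mathbf{C} \cdot (\mathbf{B} \times \mathbf{A})$$

The second property is geometrical. If A, B, and C form the
edges of a parallelepiped, the magnitude of the triple scalar product
is the volume of that parallelepiped.

Let $\theta$ be the angle between (B × C) and A.

$|\mathbf{B} \times \mathbf{C}| = |\mathbf{B}|\,|\mathbf{C}| \sin\theta_{BC}$
is the area of the parallelogram defined by B and C, where
$\theta_{BC}$ is the angle between B and C.

$$\text{Area} = \text{base} \times \text{altitude} = |\mathbf{B}|\,|\mathbf{C}| \sin\theta_{BC}$$

Now the projection of A on a line perpendicular to the face
B × C is the height of the parallelepiped.

$$\text{Volume} = |\mathbf{A}| \cos\theta \cdot (\text{base area}) = |\mathbf{A}| \cos\theta\,|\mathbf{B} \times \mathbf{C}| = \mathbf{A} \cdot (\mathbf{B} \times \mathbf{C})$$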
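Both properties of the triple scalar product can be checked on a small example. The sketch assumes the component forms of the scalar and vector products; the three edge vectors are arbitrary illustrative choices and the function names are hypothetical.

```python
def dot(A, B):
    """Scalar product from Cartesian components (assumed component form)."""
    return sum(a * b for a, b in zip(A, B))

def cross(A, B):
    """Cross product from components in a right-handed Cartesian system (assumed formula)."""
    return (A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0])

def triple_scalar(A, B, C):
    """A . (B x C)"""
    return dot(A, cross(B, C))

if __name__ == "__main__":
    A = (1.0, 1.0, 4.0)
    B = (2.0, 0.0, 0.0)
    C = (0.0, 3.0, 0.0)
    # Cyclic orders agree
    print(triple_scalar(A, B, C), triple_scalar(C, A, B), triple_scalar(B, C, A))
    # Reversing the cycle changes the sign
    print(triple_scalar(A, C, B))
```

Here the base spanned by B and C has area 6 and the height of A above that base is 4, so the volume of the parallelepiped is 24, the value of the cyclic products.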
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The Triple Vector Product 

The vector B X C is perpendicular to the plane of B and C. 



The plane of B and C contains all lines and vectors perpendic¬ 
ular to the vector B X Cat the point P. The vector A X (B X C) 
therefore must lie in the plane formed by B and C. 



The vector A X (B X C) is called the triple vector product of 
A, B, and C. 

In order to form a non zero vector B X C; B and C must 
not be parallel or anti-parallel. Since A X (B X C) lies in the 
plane formed by B and C the triple vector product can be expressed 
as a sum of two vectors which are parallel or anti-parallel to B and 
C respectively. 

Another way of saying this is to state that B and C are linearly 
independent (not parallel or anti-parallel) and form a subspace 
(in this case a plane). 
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E. Resolution Along a Complete Set of Base Vectors 


Introduction 

Theorem: A space can be defined in terms of a linearly independent
set of vectors, and any arbitrary vector in this space can be
expanded in terms of its projections upon the linearly independent
set of vectors.

A set of vectors F_1, F_2, ..., F_N is said to be linearly independent
if the relation

Σ_n a_n F_n = 0   (n = 1, ..., N)

is true if and only if all of the a_n = 0.

To illustrate this point consider the two vectors B and C and the 
scalars b and c. If 


bB + cC = O 

then, if b and c are not zero, B is parallel or anti-parallel to C,
since

bB = −cC

Therefore when B and C are specified not to be parallel (or
anti-parallel) the only possible solution to bB + cC = 0 is that

b = 0 and c = 0

Under this last condition B and C are said to be linearly
independent.

Now regard an arbitrary vector G in the plane (or subspace)
defined by B and C.

The projection² of G on B and C is performed just as we projected
a point P on the general coordinate axes in Chapter 1.

To perform this expansion construct a vector along B and a vector
along C such that

nB + mC = G 

This operation is an elementary example of the expansion 
theorem given previously. 

After this digression we can see that because A X (B X C) lies
in the plane (subspace) defined by B and C, the triple vector
product can be expanded in terms of the vectors B and C.

We shall quote the result without proof at this point. The proof
can be performed once we have established the base vectors; it is a
straightforward but tedious operation.³

The expansion of A X (B X C) in terms of B and C is called
the BAC minus CAB formula,

A X (B X C) = B(A • C) − C(A • B)

            = nB − mC

n = (A • C)
m = (A • B)
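Since the formula is quoted without proof, a numerical spot check may be reassuring. This is only a sketch with arbitrarily chosen sample vectors (an assumption); it confirms the identity for one case and proves nothing in general.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def scale(s, v):
    return tuple(s * x for x in v)

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

# Sample (assumed) vectors.
A, B, C = (1, 2, 3), (0, 4, 1), (2, 0, 1)

lhs = cross(A, cross(B, C))                   # A x (B x C)
n, m = dot(A, C), dot(A, B)                   # coefficients of B and C
rhs = sub(scale(n, B), scale(m, C))           # B(A.C) - C(A.B)

# Moving the brackets gives a different vector for these samples.
other_grouping = cross(cross(A, B), C)        # (A x B) x C
```

For these vectors both sides equal (−22, 20, −6), while the other grouping of brackets gives (−1, 18, 2).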


A note of warning should be made concerning the brackets about
B X C in

A X (B X C)

These brackets specify that the cross product of B and C must
be taken before the cross product is taken with A.

It can be shown for instance that, in general,

A X (B X C) ≠ (A X B) X C

² Note that this is not a projection as described at the beginning of this chapter.
This is a projection obtained by passing lines through the terminus of G parallel
to B and C. The intersections of these lines with extensions of B and C determine
the lengths of the projections (see Chapter 1).

³ We can verify this result up to a constant multiplier in the following fashion.
Assume A X (B X C) = nB − mC. Take the scalar product of this equation with
A and utilize the properties of the triple scalar product.

A • [A X (B X C)] = (B X C) • (A X A) = 0 = n(A • B) − m(A • C)

Then

n = A • C and m = A • B is a solution.


Base Vectors 

Let A, B, and C be three non-coplanar vectors issuing from a common
point O.

If A, B, and C are non-coplanar vectors, they are linearly
independent, and any arbitrary vector F at O can be represented as
the resultant of the sum of three vectors which are parallel to A, B,
and C respectively. In other words, it is possible to write

F = aA + bB + cC




To find aA + bB + cC we must construct a parallelepiped with 
edges aA, bB, and cC as shown above. 

It is apparent that a suitable choice of a, b, and c will give a
representation of F provided A, B, and C do not lie in a plane.

One can stipulate that A, B, and C are non-coplanar (or linearly
independent) by requiring that A • (B X C) ≠ 0.

To construct the parallelepiped we pass planes through the
terminal end of F, parallel to the three planes defined by (A, B);
(B, C); and (C, A). The intersection of the plane parallel to (B, C)
with the extension of A provides the length of the vector aA.
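The geometric construction has an algebraic counterpart. The sketch below assumes a standard result that is not derived in the text: taking the scalar product of F = aA + bB + cC with B X C removes the b and c terms, so a = F • (B X C) / A • (B X C), and cyclically for b and c. The sample vectors are arbitrary, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Sample (assumed) non-coplanar base vectors and a vector F to resolve.
A, B, C = (1, 2, 3), (0, 4, 1), (2, 0, 1)
F = (3, 1, 2)

box = dot(A, cross(B, C))          # must be nonzero for a valid basis
assert box != 0

a = Fraction(dot(F, cross(B, C)), box)
b = Fraction(dot(F, cross(C, A)), box)
c = Fraction(dot(F, cross(A, B)), box)

# Rebuild F from its three pieces aA, bB, cC.
rebuilt = tuple(a * A[i] + b * B[i] + c * C[i] for i in range(3))
```

The requirement box != 0 is the non-coplanarity condition A • (B X C) ≠ 0 stated above.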
Orthogonal Base Vectors

Orthogonal bases of arbitrary length. 

If the three vectors A, B and C at O are mutually orthogonal,* 
the resolution of an arbitrary vector F along them becomes a much 
more straightforward procedure in terms of the vector operations 
which we have defined. 

Consider the three mutually orthogonal vectors A, B, and C and 
an arbitrary vector F. 



The parallelepiped in this case is a rectangular parallelepiped. 
Under these conditions of orthogonality the numbers, a, b, and c 
can be obtained by calculating the projection of F upon A, B, and 
C respectively. 


Regard the triangle OPQ shown in the diagram.

* One vector is orthogonal to a second vector when the two vectors
are mutually perpendicular (i.e., the angle between them is 90°).

Previously we defined the projection of one vector upon another
in terms of the dot or scalar product.


F • A = |F| |A| cos θ_a

a|A| = |F| cos θ_a = (F • A) / |A|

Thus

aA = [(F • A) / |A|²] A

In the same manner,

bB = [(F • B) / |B|²] B

and

cC = [(F • C) / |C|²] C

The vector F resolved along A, B, and C can now be written

F = [(F • A) / |A|²] A + [(F • B) / |B|²] B + [(F • C) / |C|²] C

It is important to see at this point that the base vectors A, B,
and C have been written in a form which provides three unit vectors
A/|A|, B/|B|, and C/|C|. These unit vectors will be denoted by the
symbol ε.

Let

ε_a = A/|A|,   ε_b = B/|B|,

and

ε_c = C/|C|

then

F = (F • ε_a) ε_a + (F • ε_b) ε_b + (F • ε_c) ε_c
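A short sketch of this resolution along orthogonal bases of arbitrary length follows. The base vectors and F are sample values chosen for illustration (an assumption); each coefficient is the scalar product with the base vector divided by the squared length of that base vector.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Sample (assumed) mutually orthogonal base vectors of unequal length.
A, B, C = (1, 1, 0), (1, -1, 0), (0, 0, 2)
assert dot(A, B) == dot(B, C) == dot(C, A) == 0

F = (3, 1, 4)

# Coefficient along each base vector: (F . E) / |E|^2.
a = Fraction(dot(F, A), dot(A, A))
b = Fraction(dot(F, B), dot(B, B))
c = Fraction(dot(F, C), dot(C, C))

rebuilt = tuple(a * A[i] + b * B[i] + c * C[i] for i in range(3))
```

If the bases were not mutually orthogonal these simple projections would no longer rebuild F, which is why the orthogonality check is asserted first.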


This process of normalization (converting the bases to unit bases) 
leads us naturally to the next section which deals with the Cartesian 
unit base vectors. 

The Cartesian Bases. 

In the first chapter concerning geometry we defined the three
dimensional Euclidean space with the coordinate axes x, y, and z and
an origin O.

We now set up the three base vectors A, B, and C along the positive
x, y, and z axes respectively.

The order of the base vectors is taken to provide a right handed
system.

The unit base vectors for the Cartesian space have standard symbols.
Assuming that A lies along the positive x axis, B along the positive
y axis, and C along the positive z axis, the unit base vectors are
defined as

i = A/|A|,   j = B/|B|,   k = C/|C|

Actually six quantities are required to specify a vector in
the Cartesian space. We assume that there is a set of Cartesian base
vectors associated with every point in E3. Thus to specify a vector at
a point P we require,

1. The position vector of the point P (3 numbers).

2. The components of the vector along the bases i, j, and k
(3 numbers).

The Cartesian Bases


Using the i, j, k notation for the three unit base vectors in E3 the
arbitrary vector F is written

F = (F • i) i + (F • j) j + (F • k) k

The scalar coefficients (F • i), (F • j), and (F • k) are called the
Cartesian components of the vector F and are written as

F_x = (F • i) = |F| cos α
F_y = (F • j) = |F| cos β
F_z = (F • k) = |F| cos γ

cos α, cos β, and cos γ are the direction cosines of the vector F
relative to the x, y, and z axes. The components F_x, F_y, and F_z
plus the direction cosines are illustrated in the figure below.

The vector is now presented in its more familiar form

F = F_x i + F_y j + F_z k
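The relation between components and direction cosines can be illustrated with a small sketch. The vector F below is a sample value (an assumption) chosen so that its length is a whole number.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)   # Cartesian unit base vectors
F = (2.0, 3.0, 6.0)                         # sample vector, |F| = 7

components = tuple(dot(F, e) for e in (i, j, k))     # F_x, F_y, F_z
length = math.sqrt(dot(F, F))
cosines = tuple(comp / length for comp in components)

# The squares of the direction cosines always sum to one.
sum_of_squares = sum(c * c for c in cosines)
```

Because the cosines are the components of the unit vector F/|F|, their squares sum to one for any nonzero F.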

The notation can be contracted by representing F_x as F_1, F_y as
F_2, F_z as F_3; in addition we can write ε_1, ε_2, and ε_3 in place
of i, j, k. Then

F = F_1 ε_1 + F_2 ε_2 + F_3 ε_3 = Σ_n F_n ε_n   (n = 1, 2, 3)

This notation conforms to the more general symbolism which
was used in the first chapter in the section concerning Matrices.

In fact if we utilize our summation convention for repeated
indices the vector can be written as

F = F_n ε_n

The Addition of Resolved Vectors

Two vectors F and G can be added to form a third vector R;
F + G = R. Previously we performed the addition by the parallelogram
method. At that time we saw that the parallelogram method was
equivalent to a construction in which the tail of one vector is
connected to the head of the other and the resultant is obtained by
drawing a vector from the free tail to the free head of the
connected pair.

Addition of resolved vectors is an algebraic operation whereas
the addition of unresolved vectors is a graphical problem.

When two vectors F and G are added, the component of a given base
vector for the resultant vector R is the sum of the components of F
and G corresponding to the same base vector.

The diagram to the right illustrates in two dimensions how
components of like base vectors are summed.

If R = F + G
where F = F_x i + F_y j + F_z k
and G = G_x i + G_y j + G_z k

then

F + G = (F_x + G_x) i + (F_y + G_y) j + (F_z + G_z) k

R = R_x i + R_y j + R_z k
and R_x = F_x + G_x
    R_y = F_y + G_y
    R_z = F_z + G_z

As a numerical example consider the following case:
Let F = i − 3j + 10k
and G = (1/2) i + 5j − (7/2) k

The sum F + G is then given by

R = F + G = (3/2) i + 2j + (13/2) k

Because subtraction is a special case of addition we can write

F − G = (F_x − G_x) i + (F_y − G_y) j + (F_z − G_z) k
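Component-wise addition and subtraction are easy to mirror in code. The sketch below uses the example vectors F = i − 3j + 10k and G = (1/2)i + 5j − (7/2)k with exact fractions; the helper names add and subtract are illustrative choices.

```python
from fractions import Fraction as Fr

def add(u, v):
    """Add two resolved vectors component by component."""
    return tuple(a + b for a, b in zip(u, v))

def subtract(u, v):
    """Subtract resolved vectors component by component."""
    return tuple(a - b for a, b in zip(u, v))

F = (Fr(1), Fr(-3), Fr(10))
G = (Fr(1, 2), Fr(5), Fr(-7, 2))

R = add(F, G)        # components of F + G along i, j, k
D = subtract(F, G)   # components of F - G along i, j, k
```

Here R works out to (3/2, 2, 13/2) and D to (1/2, −8, 27/2).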


Products of Resolved Vectors 

The dot or scalar product of two resolved vectors.

When one considers the dot product of two vectors F and G
which have been resolved into components along the base vectors
i, j, and k the final result is determined conveniently by the scalar
products of the unit base vectors.

i • i = 1     j • i = 0     k • i = 0
i • j = 0     j • j = 1     k • j = 0
i • k = 0     j • k = 0     k • k = 1

The orthogonality of the bases leads to convenient relations for
scalar products.

F • G = (F_x i + F_y j + F_z k) • (G_x i + G_y j + G_z k)
      = F_x G_x (i • i) + F_x G_y (i • j) + F_x G_z (i • k)
      + F_y G_x (j • i) + F_y G_y (j • j) + F_y G_z (j • k)
      + F_z G_x (k • i) + F_z G_y (k • j) + F_z G_z (k • k)

and

F • G = F_x G_x + F_y G_y + F_z G_z

note: |F| = √(F • F) = √(F_x² + F_y² + F_z²)

It is interesting to observe the compactness achieved in such a
development when the more generalized notation is used.

If F and G are written

F = Σ_m F_m ε_m   (m = 1, 2, 3)

and

G = Σ_n G_n ε_n   (n = 1, 2, 3)

then

F • G = {Σ_m F_m ε_m} • {Σ_n G_n ε_n}

      = Σ_m Σ_n F_m G_n (ε_m • ε_n)

Now in the case of orthonormal bases

ε_m • ε_n = δ_mn

where

δ_mn = 1 if m = n
     = 0 if m ≠ n

Our scalar product becomes⁴

F • G = Σ_m Σ_n F_m G_n δ_mn

      = Σ_m F_m G_m

      = F_1 G_1 + F_2 G_2 + F_3 G_3

⁴ This particularly simple expansion as the sum of the products of like components
is characteristic only of the orthogonal systems of base vectors.

To illustrate a typical scalar product the vectors of the example
of the previous section can be used again,

F = i − 3j + 10k

G = (1/2) i + 5j − (7/2) k

Thus

F • G = (1 X 1/2) − (3 X 5) − (7/2 X 10) = −49 1/2


The vector product 

Once more the orthonormal property of i, j, k provides a convenient
result when we consider the vector product of two arbitrary
vectors F and G.

F X G = F_x G_x (i X i) + F_x G_y (i X j) + F_x G_z (i X k)
      + F_y G_x (j X i) + F_y G_y (j X j) + F_y G_z (j X k)
      + F_z G_x (k X i) + F_z G_y (k X j) + F_z G_z (k X k)

Using the Right Hand Convention the cross products of the unit
base vectors can be determined.

i X i = 0      j X i = −k     k X i = +j
i X j = +k     j X j = 0      k X j = −i
i X k = −j     j X k = +i     k X k = 0

Thus our expansion of the cross product F X G becomes,

F X G = (F_y G_z − F_z G_y) i +
        (F_z G_x − F_x G_z) j +
        (F_x G_y − F_y G_x) k


The cross product expanded in terms of the components of the
two vectors involved can be expressed in terms of a determinant.

        | i    j    k   |
F X G = | F_x  F_y  F_z |
        | G_x  G_y  G_z |

1. The base vectors occupy the first row.

2. The components of the leading vector in the product occupy
the second row.

3. The components of the last vector in the product occupy the
third row.

The proof of the validity of the determinant form can be 
obtained by expanding the determinant and comparing the result 
with the form obtained previously. 

Once again we can utilize the example vectors to demonstrate
a typical calculation employing the cross product.

As before

F = i − 3j + 10k
G = (1/2) i + 5j − (7/2) k

Then

        | i     j     k    |
F X G = | 1    −3     10   |
        | 1/2   5    −7/2  |

F X G = (21/2 − 50) i + (5 + 7/2) j + (5 + 3/2) k
      = −(79/2) i + (17/2) j + (13/2) k
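The scalar and vector products of the two example vectors can be reproduced with exact arithmetic. This is a minimal sketch; the variable names are illustrative, and the component formulas are the ones given in this section.

```python
from fractions import Fraction as Fr

F = (Fr(1), Fr(-3), Fr(10))
G = (Fr(1, 2), Fr(5), Fr(-7, 2))

# Scalar product: sum of products of like components.
dot_FG = sum(f * g for f, g in zip(F, G))

# Vector product from the component formula.
cross_FG = (F[1] * G[2] - F[2] * G[1],
            F[2] * G[0] - F[0] * G[2],
            F[0] * G[1] - F[1] * G[0])

# F X G must be perpendicular to both factors.
check_F = sum(f * c for f, c in zip(F, cross_FG))
check_G = sum(g * c for g, c in zip(G, cross_FG))
```

The scalar product comes out as −99/2 (that is, −49 1/2), and the vector product as (−79/2, 17/2, 13/2); both perpendicularity checks vanish.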




We should note in closing this section that the triple scalar
product of three arbitrary vectors A, B, and C assumes a particularly
convenient form in terms of the components of the three vectors.

If we wish to compute

A • (B X C)

then

          | i    j    k   |
(B X C) = | B_x  B_y  B_z |
          | C_x  C_y  C_z |

and

              | A_x  A_y  A_z |
A • (B X C) = | B_x  B_y  B_z |
              | C_x  C_y  C_z |

In other words the triple scalar product of three vectors, A, B, 
and C is merely the determinant of their components. The cycling 
procedure can now be seen to be equivalent to an interchange of 
rows. The negative sign which arises when one writes 

A • (B X C) = — B • (A X C) 


can be associated with the sign change which occurs when two rows 
of the determinant are interchanged. 
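The determinant form and its behaviour under row changes can be tried out on sample vectors. The sketch below is an illustration only; det3 is a hypothetical helper that expands a 3 by 3 determinant along its first row, and the vectors are arbitrary.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(rows):
    """Determinant of a 3x3 array, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Sample (assumed) vectors.
A, B, C = (1, 2, 3), (0, 4, 1), (2, 0, 1)

as_determinant = det3((A, B, C))
as_triple_product = dot(A, cross(B, C))

swapped_rows = det3((B, A, C))   # one interchange of two rows
cycled_rows = det3((C, A, B))    # cyclic reordering of the rows
```

A single interchange of two rows flips the sign, while a cyclic reordering (which amounts to two interchanges) leaves the value unchanged.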


F. Vector Transformations 

In Chapter 1 the term “vector” was used interchangeably with
the term “column matrix” (or row matrix). This labeling is
permissible because the representation of a point in a specified space
can be accomplished by using either the listing of the coordinates
of the point in an ordered array (a column matrix) or by representing
coordinates with their associated base vectors.

The two representations are equivalent.

The particular representation which one uses depends upon the
problem at hand and upon the convenience with which the symbols
can be manipulated.

The equivalence of the two representations is illustrated by the
fact that they possess the same transformation properties. The vector
or linear array of objects is one of that group of mathematical
objects called tensors.

The order of a tensor depends upon its transformation properties.

Consider the rotations (orthogonal transformations) which we 
have already discussed. 

Under these transformations

1. The scalar quantity is invariant. The scalar is called a zeroth
order tensor.

2. The vector transforms in the following manner:

r' = S • r

or

x_j' = S_jk x_k

or

[x_1']   [S_11  S_12] [x_1]
[x_2'] = [S_21  S_22] [x_2]

Objects which transform in this manner are called first order
tensors or vectors.

3. We should also remark that a matrix A transforms as follows:

B = S • A • S⁻¹

or

B_lm = S_lk A_kn (S⁻¹)_nm

Objects which transform in the manner shown above are called
second order tensors.

Our aim is not to provide a complete classification of tensors by
their transformation properties but rather to indicate that such a
classification exists and that it should be kept in mind.

When a vector is described in terms of its matrix form certain
characteristics must be considered. To introduce these characteristics
we shall employ a simple example.

As an example, regard the position vector r which locates the point
(1, 3, 2) in a three dimensional vector space and which can be
described by the column matrix

    [1]
r = [3]
    [2]

or by the vector form

r = ε_1 + 3ε_2 + 2ε_3

Notice that the matrix representation suppresses the base vectors;
in other words one must understand that the base ε_1 is associated
with the first position in the column and that ε_m is associated with
the m-th position.

Inner products or dot products are taken in the matrix notation
in the following manner,

                  [1]
r • r = [1, 3, 2] [3] = 14
                  [2]

or in the vector notation

r • r = Σ_i Σ_k x_i x_k (ε_i • ε_k) = Σ_i x_i²


In many cases (such as that of rotations) the transformation of a
vector may be achieved most easily when the vector is represented
in its matrix form.

To illustrate this we can consider the rotations as applied to
vectors. This procedure seems paradoxical in the following sense.

The rotation matrix can be thought of as a rotation of the
coordinates,

r' = S • r

In other words the application of S to r (described in terms of the
bases i and j)⁵ gives the same vector in a new set of coordinates i'
and j' which make an angle φ with respect to the old coordinates.
Stipulating this interpretation of S we consider the vector to be
fixed in i and j and then obtain its components in the system i', j'
rotated counterclockwise with respect to i and j.

i • i' = cos φ

If the problem deals with a set of moving bases, then the positions
of the components in the final vector must be associated with the new
bases.

Consider the rotation of the bases

r' = S • r

or

[x_1']   [S_11  S_12] [x_1]
[x_2'] = [S_21  S_22] [x_2]

with

r = x_1 i + x_2 j

and

r' = x_1' i' + x_2' j'


⁵ We use ε_1 and i interchangeably. This alternate use of the different symbols will
familiarize the reader with the symbols and their relationships.

In this example the base vectors after transformation are not
the same bases as used in the expansion of the vector in the original
system.

The operation S can also be regarded as a rotation of the vector in
a stationary coordinate system.

From this point of view the operation

r' = S • r

provides a new vector r' represented in terms of the original bases.

Thus S can be thought of as a clockwise rotation of the vector r
through an angle φ.

With such an interpretation the base vectors associated with the
final vector r' are the original bases i and j.

[x_1']   [S_11  S_12] [x_1]
[x_2'] = [S_21  S_22] [x_2]

with

r = x_1 i + x_2 j

and with

r' = x_1' i + x_2' j


This possibility of a double interpretation leads to little difficulty 
in the end result if the questions and conditions of the problem are 
clearly stated. 
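The two readings of S can be compared numerically. The sketch below assumes the explicit two dimensional form S = [[cos φ, sin φ], [−sin φ, cos φ]], which is consistent with the description above (components in axes turned counterclockwise by φ) but is not written out in this section; the angle and the vector r are sample values.

```python
import math

def apply_S(phi, r):
    """Apply the assumed rotation matrix S to the column (x1, x2)."""
    c, s = math.cos(phi), math.sin(phi)
    return (c * r[0] + s * r[1], -s * r[0] + c * r[1])

phi = math.radians(30)
r = (2.0, 1.0)
x1p, x2p = apply_S(phi, r)

# First reading: same vector, new bases i', j' turned counterclockwise by phi.
i_new = (math.cos(phi), math.sin(phi))
j_new = (-math.sin(phi), math.cos(phi))
rebuilt = (x1p * i_new[0] + x2p * j_new[0],
           x1p * i_new[1] + x2p * j_new[1])

# Second reading: new vector x1' i + x2' j in the old bases.
angle_before = math.atan2(r[1], r[0])
angle_after = math.atan2(x2p, x1p)
length_before = math.hypot(*r)
length_after = math.hypot(x1p, x2p)
```

With the new bases the primed components rebuild the original vector; with the old bases the same numbers describe a vector of equal length whose direction angle has decreased by φ, that is, a clockwise rotation.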
CHAPTER 3

Analytic Geometry

In Chapter 1 we mentioned that analytic geometry is the
algebraic description of geometrical figures in E2 and E3. Some
material which is considered a part of analytic geometry has been
introduced in our discussion of Geometry and Matrices in Chapter 1.

We also find that a portion of traditional Coordinate Geometry
has been anticipated in Chapter 2 which concerned the Vector
Algebra.

The term Analytic Geometry has come to imply the study of certain
elementary figures in E2 and E3, namely, those of the linear and
quadratic varieties.


A. Introduction

In E2 the linear figures are straight lines, and the quadratic
figures are the conic sections (ellipses, hyperbolas, parabolas,
and circles).

In E3 the linear equations represent the planes, while the
quadratic figures are the ellipsoids, spheres, paraboloids, etc.


B. Loci 



As an example of the former, the locus of all points in E3 at a
fixed perpendicular distance R from a given straight line is clearly
a cylinder of radius R with the given line as the axis.

To illustrate the locus defined by algebraic equations, let x, y be
a set of Cartesian coordinates in E2, and let C be the locus of all of
the points in E2 whose coordinates satisfy the relation

x² + y² = 1

C is clearly a circle of unit radius with its center at the origin, for
√(x² + y²) is simply the distance of the point (x, y) from the origin O.

Analytic geometry is concerned primarily with loci as specified
by algebraic equations. This is by far the most useful technique.
Shortly we shall see that loci specified by geometrical conditions
can usually be specified equally well by algebraic relations.

















C. Straight Lines 

Instead of discussing the properties of the equation for a straight
line in E2 and later extending the discussion to E3 we will utilize
the vector technique to write down the general form for a line in
E3. With this in view we can later, as examples, develop the various
strictly algebraic forms.

There are several methods by which a straight line can be
represented.

By and large the most useful of these representations is that
particular one which uses the definition of the vector as a DIRECTED
LINE SEGMENT to describe the straight line.

Consider the line L; the point P lies upon L, and the vector u
originating at P is constructed to lie along L.

The general description of the line L is provided by an equation
for the position vector r which terminates only upon L.

The definition of r can be obtained by a vector addition of the
vector R_0 which locates the point P and a vector nu lying upon L.
Notice that n is a number (positive or negative) which varies the
magnitude of the vector u.

In general therefore

r = nu + R_0


Thus whenever we are given a point on the line and a vector
parallel to the line we can define the line by the vector equation
above.

Let

R_0 = x_0 i + y_0 j + z_0 k

and let

u = αi + βj + γk

then

r = (nα + x_0) i + (nβ + y_0) j + (nγ + z_0) k

An equivalent equation is

r = xi + yj + zk

Therefore we can write

x = nα + x_0
y = nβ + y_0
z = nγ + z_0

These are called the parametric equations for the line L. The
variable parameter is n.

In the most general case α, β, and γ are arbitrary numbers.

If we define the vector u as a unit vector such that

α² + β² + γ² = 1

then n is the distance from P to the point of termination of r.

In other words under these conditions n is the distance from
(x_0, y_0, z_0) to (x, y, z).

Now consider two lines L and L' intersecting at P.
Let L be defined by

r = nu + R_0

and let L' be defined by

r = mv + R_0















































If both u and v are unit vectors the cosine of the angle θ between
L and L' is found quite readily by taking the dot product of u and v.
Writing the components of v as α', β', and γ',

u • v = |u| |v| cos θ = αα' + ββ' + γγ'

If |u| = |v| = 1 then

cos θ = αα' + ββ' + γγ'

In case u and v have not been normalized (made into unit vectors)
then

cos θ = (αα' + ββ' + γγ') / [√(α² + β² + γ²) √(α'² + β'² + γ'²)]

Suppose that u and v are unit vectors. We can illustrate the
power of this representation by finding the distance d between a
point Q_1 on L and a point Q_2 on L', where Q_1 is located a distance n
from P and Q_2 a distance m from P. By the law of cosines

d² = m² + n² − 2mn cos θ = m² + n² − 2mn (αα' + ββ' + γγ').

As a numerical example of these manipulations compute the angle θ
when P = (1, 2, 3), Q_1 = (3, 0, 1), and Q_2 = (0, −1, 2).

The solution is readily obtained if we first construct the
parametric vector form for each line.

First consider L (defined by U)

R_0 = i + 2j + 3k

The vector U connecting the two points P and Q_1 is

U = (3 − 1) i + (0 − 2) j + (1 − 3) k = 2i − 2j − 2k

We can convert U to the unit vector u;

u = U/|U| = (2i − 2j − 2k)/√(4 + 4 + 4) = (1/√3) {i − j − k}

Then L is given by

r = (n/√3) {i − j − k} + {i + 2j + 3k}

In the same manner we can define L'.

V = (0 − 1) i + (−1 − 2) j + (2 − 3) k = −i − 3j − k

The unit vector v is

v = V/|V| = (−i − 3j − k)/√(1 + 9 + 1) = (1/√11) (−i − 3j − k)

Thus L' is given by

r = (m/√11) {−i − 3j − k} + {i + 2j + 3k}

or

x = −m/√11 + 1
y = −3m/√11 + 2
z = −m/√11 + 3


























The cosine of the angle between L and L' is merely

u • v = cos φ = (1/√3) (i − j − k) • (1/√11) (−i − 3j − k)

or

cos φ = (−1 + 3 + 1)/√33 = 3/√33
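The numbers in this example, together with the law of cosines result for the distance between two points on the intersecting lines, can be checked with a short sketch. The helper names are illustrative; the points are those of the example, with Q_1 and Q_2 themselves taken as the two points on L and L'.

```python
import math

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

P, Q1, Q2 = (1, 2, 3), (3, 0, 1), (0, -1, 2)

U = sub(Q1, P)    # direction of L  (not yet normalized)
V = sub(Q2, P)    # direction of L' (not yet normalized)

cos_angle = dot(U, V) / (norm(U) * norm(V))

# Law of cosines with n = |PQ1| and m = |PQ2|.
n, m = norm(U), norm(V)
d_squared = m**2 + n**2 - 2 * m * n * cos_angle

# Direct computation of |Q1 - Q2|^2 for comparison.
direct = dot(sub(Q1, Q2), sub(Q1, Q2))
```

The cosine agrees with 3/√33 and both routes give a squared distance of 11 between Q_1 and Q_2.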


We have shown how one constructs a line between two points P 
and Q in the previous example. It is important to recognize that 
this method of analysis (the parametric vector) is the only method 
which holds equally well in E 3 or E 2 . Other representations which 
we shall consider must carry conditions limiting their application 
to one space or the other. 

Because the slope of the line in two dimensions is of utmost
importance we can now examine our general parametric form in two
dimensions so that we can see most clearly the role of the line slope.

Regard the point at which L intersects the x axis and the angle
φ which L makes with the x axis.

The slope of L in two dimensions is defined as the tangent of φ.
Using the vector parametric form with u a unit vector we see that
the slope of L is given by

tan φ = slope = (u • j)/(u • i) = β/α

In three dimensions we can speak of the slope of the projections
of L upon the 3 possible planes: (x, y), (y, z), and (z, x).

In two dimensions we see that if the slope (tan φ) is given then
the unit vector u is specified.

Since tan φ = β/α

then cos φ = α

and sin φ = β

We then can write

u = cos φ i + sin φ j

In many cases when only the slope is given it is well to remember
that

cos φ = 1/√(1 + tan² φ)

sin φ = tan φ/√(1 + tan² φ)

thus

r = [n/√(1 + tan² φ)] {i + tan φ j} + R_0


The customary equation that is utilized to represent a line L in
two dimensions is the linear algebraic equation in two unknowns.

The locus of all points in E2 whose coordinates satisfy an
algebraic equation of the type

ax + by + c = 0

is a straight line, where a, b, and c are numbers with a and b not
simultaneously zero (otherwise our equation reduces to c = 0).
The parametric equations obtained from the vector form are,

x = nα + x_0
y = nβ + y_0

If the parameter n is eliminated between these two equations we
find that

(x − x_0)/α = (y − y_0)/β = n

and

y − y_0 = (β/α)(x − x_0) = tan φ (x − x_0)

This equation has been named the point-slope equation and as
we see it arises naturally from our more general form. By further
manipulation we obtain the linear form

βx − αy − βx_0 + αy_0 = 0

Thus

β is proportional to a
−α is proportional to b

and

−βx_0 + αy_0 is proportional to c

In three dimensions the elimination of the parameter n serves to
give three algebraic relations.

(x − x_0)/α = (y − y_0)/β = (z − z_0)/γ

Later we will see that these equations result from the two
simultaneous linear algebraic equations in three unknowns. In three
dimensions the linear equation represents a plane. Thus two
simultaneous linear equations represent the intersection of two planes,
which is a straight line.


Representation of a line in E2 by the perpendicular vector

A more restricted vector form for a straight line in two
dimensions can be set up using the linear algebraic form directly.
Construct a constant vector

W = a i + b j

If all of the points (x, y) are represented as a position vector

r = xi + yj

the equation of the straight line is written

W • r = −c

At first glance we see that W • r = ax + by = −c which agrees
with our original linear form.

The vector equation has several interesting properties.

1. The line defined by W • r = −c consists of the terminal points
of all of the vectors r.

2. The line is perpendicular to W.

3. The line intersects W at a distance −c/|W| from the base
point O.

An important manipulation which one must be prepared to make is the
transformation from one characteristic form to another. This is
readily achieved if we recall that

a is proportional to β

and

b is proportional to −α

Since the components of W are a and b we can construct the
components of u in the parametric form in the following way:

α = −b/√(a² + b²)

and

β = a/√(a² + b²)

Since c is proportional to −βx_0 + αy_0 with the same constant of
proportionality, knowing α and β we can find R_0 by choosing a
particular x_0 and solving for y_0;

y_0 = (c' + βx_0)/α,   where c' = c/√(a² + b²)

To illustrate the use of the method of the perpendicular vector
take the following problem:

Find the distance from the origin to the line

2x + 3y − 6 = 0

The perpendicular vector from O is

W = 2i + 3j

The distance from O to L is the value of the projection of r
upon W,

r • W/|W| = −c/|W| = 6/√13
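The distance in this problem can be computed directly from the coefficients of the line. The sketch below is illustrative; the foot of the perpendicular is an extra quantity, not asked for in the problem, included only to confirm that the computed distance is reached at a point on the line.

```python
import math

# Line a x + b y + c = 0 with perpendicular vector W = a i + b j.
a, b, c = 2.0, 3.0, -6.0

W_length = math.hypot(a, b)
distance = abs(c) / W_length          # |c| / |W|

# Foot of the perpendicular from O: the point along W that lies on the line.
foot = (-c * a / W_length**2, -c * b / W_length**2)
residual = a * foot[0] + b * foot[1] + c     # zero if the foot is on the line
foot_distance = math.hypot(*foot)
```

The result is 6/√13, roughly 1.664, and the foot of the perpendicular is the point (12/13, 18/13).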



D. The Plane 

The plane in E3 is the locus of all points satisfying the relation

ax + by + cz + d = 0

assuming that a, b, and c are not simultaneously zero.

In vector form, with W = ai + bj + ck and r = xi + yj + zk, the
relation reads W • r + d = 0. When z = 0 this form becomes the
equation for the straight line in the xy plane, since this condition
of z = 0 defines the intersection of the plane W • r + d = 0 and the
xy plane.

The perpendicular distance of the plane from the origin is given
by the projection of r on W,

⊥ distance* O → plane = r • W/|W| = −d/|W|


It is clear in all of these cases that if W is constructed as a unit
vector w,

w = W/|W|

then in the equation

w • r + δ = 0

the magnitude of δ always gives the perpendicular distance in
question.

As an exercise obtain the equation of the line through the point
(3, 2, 1) which is perpendicular to the plane

x + 2y + 3z − 6 = 0



* The sign ⊥ is shorthand for the word perpendicular.

Notice first that W is perpendicular to the plane;

W = i + 2j + 3k

We normalize; and the vector u of the line L is parallel to the
perpendicular vector w.

u = w = (i + 2j + 3k)/√(1 + 4 + 9) = (1/√14) (i + 2j + 3k)

Because L must pass through (3, 2, 1) we know R_0,

R_0 = 3i + 2j + k

Therefore the parametric equation of L is

r = nu + R_0 = (n/√14 + 3) i + (2n/√14 + 2) j + (3n/√14 + 1) k

Then

n = √14 (x − 3) = (√14/2)(y − 2) = (√14/3)(z − 1)
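The exercise can be followed step by step in code. This is a sketch under the same data (plane x + 2y + 3z − 6 = 0, point (3, 2, 1)); the additional step of locating where the line meets the plane is an illustration beyond what the exercise asks.

```python
import math

W = (1.0, 2.0, 3.0)      # perpendicular vector of the plane
d = -6.0                 # plane: W . r + d = 0
R0 = (3.0, 2.0, 1.0)     # point through which the line passes

W_length = math.sqrt(sum(w * w for w in W))
u = tuple(w / W_length for w in W)           # unit vector along the line

def point_on_line(n):
    """Point r = n u + R0 at signed distance n from R0."""
    return tuple(n * ui + r0 for ui, r0 in zip(u, R0))

# Recover n from each coordinate of a sample point (should all agree).
x, y, z = point_on_line(2.5)
n_from_x = W_length * (x - 3)
n_from_y = W_length * (y - 2) / 2
n_from_z = W_length * (z - 1) / 3

# Value of n where the line pierces the plane.
n_hit = -(sum(w * r for w, r in zip(W, R0)) + d) / W_length
hit = point_on_line(n_hit)
plane_residual = sum(w * h for w, h in zip(W, hit)) + d
```

Since u is a unit vector, the magnitude of n_hit (4/√14) is also the perpendicular distance from (3, 2, 1) to the plane.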


E. Curves in E2
Conic Sections




From such approaches one obtains very special quadratic forms 
which later can be generalized. 

The Ellipse 

A brief mention will be made of the locus definition of the
Ellipse.

In addition to these pictorial characteristics of the two
dimensional quadratic, the curves belonging to this group are
customarily defined in terms of length invariants, such as the
ellipse, which is generally defined as the LOCUS consisting of all
points Q such that the sum of the distances between Q and two
fixed points P_1 and P_2 is a constant.
















The Ellipse is the locus consisting of all points Q such that the
distance |l_1| of Q to P_1 plus the distance |l_2| of Q to P_2 is a
fixed number K (where K > 2c; 2c is the distance between P_1 and P_2).

The origin of coordinates O has been placed at the point of
symmetry, and P_1 and P_2 have been constructed upon the x_1 axis at
distances ±c from O.

Because

l_1 = r − c i

and

l_2 = r + c i

where

r = x_1 i + x_2 j

the constraint

|l_1| + |l_2| = K

becomes

|r − c i| + |r + c i| = K

or

√((x_1 − c)² + x_2²) + √((x_1 + c)² + x_2²) = K.

After squaring both sides of this equation we rearrange to maintain
the square root on one side, and then square again. The result is

x_1²/a² + x_2²/b² = 1

where

a = K/2 and b = √(a² − c²)

This form is called the normal form for the ellipse. This particular
form is a special case of the two dimensional quadratic with
A_11 = 1/a², A_12 = A_21 = 0, A_22 = 1/b².


It is easy to perceive that the ellipse crosses the x axis at (a, 0)
and (−a, 0) while it crosses the y axis at (0, b) and (0, −b). The
points (c, 0) and (−c, 0) are called the foci of the ellipse.

a = the length of the semi-major axis
b = the length of the semi-minor axis.

We observe that b would be imaginary if a² < c², or K < 2c.

The eccentricity of the ellipse is denoted by e and defined as

e = c/a = √(a² − b²)/a   (for b < a, foci on the x axis)

a is the hypotenuse of a right triangle having legs b and ae.
0 < e < 1 for the ellipse.

The circle is a degenerate form of the ellipse; it arises when
e = 0 and b = a.
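The locus definition and the normal form can be compared numerically. The sketch below picks sample values K = 10 and c = 3 (assumptions) and generates points with the standard parametrization x_1 = a cos t, x_2 = b sin t, which is not introduced in the text but satisfies the normal form identically.

```python
import math

K, c = 10.0, 3.0             # sample constant sum and focal half-distance
a = K / 2                    # semi-major axis
b = math.sqrt(a * a - c * c) # semi-minor axis
e = c / a                    # eccentricity

def distance_sum(t):
    """Sum of the distances from a point of the ellipse to the two foci."""
    x1, x2 = a * math.cos(t), b * math.sin(t)
    return math.hypot(x1 - c, x2) + math.hypot(x1 + c, x2)

sums = [distance_sum(t) for t in (0.0, 0.7, 1.9, 3.0, 4.4)]
```

Every entry of sums equals K to within rounding, with b = 4 and e = 0.6 for these sample values.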
The Hyperbola



In the previous section we asked for the locus of all points the sum 
of whose distances from two points was a constant K. To obtain the 
hyperbola we ask a similar question. 


What is the locus of all points $Q$ such that the difference of the
distances of $Q$ to two fixed points $P_1$ and $P_2$ is a constant?


Again, because of the symmetry, set the two points $P_1$ and $P_2$ on
the $x_1$ axis at $x_1 = c$ and at $x_1 = -c$.


Then

$$|\mathbf{l}_1| - |\mathbf{l}_2| = \pm K$$

with

$$\mathbf{l}_1 = (x_1 - c)\,\mathbf{i} + x_2\,\mathbf{j}, \qquad \mathbf{l}_2 = (x_1 + c)\,\mathbf{i} + x_2\,\mathbf{j}.$$


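Once again we square, rearrange, and square again, giving

$$\frac{x_1^2}{a^2} - \frac{x_2^2}{b^2} = 1,$$

where now $a = K/2$ and $b = \sqrt{c^2 - a^2}$.


The eccentricity $e$ is defined as in the case of the ellipse, noting
that $b^2 \to -b^2$; thus

$$e = \frac{c}{a} = \frac{\sqrt{a^2 + b^2}}{a}.$$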


Pictorially we observe that the curves tend to two straight lines
for large values of $x_1$ and $x_2$.

When $x_1/a$ and $x_2/b$ are much larger than 1 the equation for the
hyperbola can be approximated by

$$\frac{x_1^2}{a^2} - \frac{x_2^2}{b^2} \approx 0.$$

Thus for large distances from the origin

$$x_2 = \pm\frac{b}{a}\,x_1,$$

defining the lines toward which the curves tend. These lines are
called asymptotes.
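The approach to the asymptotes can be seen numerically. The short Python sketch below is an illustration added here, not taken from the original text; the helper name `hyperbola_x2` and the values $a = 3$, $b = 2$ are assumptions. It evaluates the upper branch of the hyperbola and compares the ratio $x_2/x_1$ with $b/a$ as $x_1$ grows.

```python
import math

def hyperbola_x2(x1, a, b):
    """Upper branch of x1^2/a^2 - x2^2/b^2 = 1 (requires |x1| >= a)."""
    return b * math.sqrt(x1 * x1 / (a * a) - 1.0)

a, b = 3.0, 2.0
for x1 in (5.0, 50.0, 5000.0):
    ratio = hyperbola_x2(x1, a, b) / x1
    # The ratio x2/x1 should approach the asymptote slope b/a.
    print(x1, ratio, b / a)
```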



If we change the signs such that

$$\frac{x_2^2}{b^2} - \frac{x_1^2}{a^2} = 1$$

we define a new set of hyperbolas with foci on the $x_2$ axis. This set
is called the conjugate set to the first set, which had foci on the $x_1$
axis.


The Parabola 

In the preceding cases we dealt with loci which were defined in 
terms of distances from two fixed points. 

The Parabola is the locus of all points $Q$ which are equidistant
from a point $P$ and a line $L$.

The only axis of symmetry in this problem is the line through $P$
perpendicular to $L$. Therefore set this line along the $x_1$ axis while
$L$ is aligned with the $x_2$ axis (or $y$ axis).

$$|\mathbf{l}_1| = |\mathbf{l}_2|$$

Now, with $P$ at $(2c, 0)$,

$$\mathbf{l}_1 = (x_1 - 2c)\,\mathbf{i} + x_2\,\mathbf{j}, \qquad \mathbf{l}_2 = x_1\,\mathbf{i}.$$

Thus

$$|\mathbf{l}_1| = |\mathbf{l}_2|$$

becomes

$$\sqrt{(x_1 - 2c)^2 + x_2^2} = \sqrt{x_1^2}$$

and

$$-4c\,x_1 + 4c^2 + x_2^2 = 0.$$

Solving for $x_2^2$,

$$x_2^2 = 4c\,(x_1 - c).$$
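The result can be checked against the defining property. The Python sketch below is an added illustration under stated assumptions: the helper name `parabola_point` and the value $c = 1.5$ are hypothetical, and the focus is taken at $(2c, 0)$ with $L$ along the $x_2$ axis, as read from the vectors $\mathbf{l}_1$ and $\mathbf{l}_2$ above.

```python
import math

def parabola_point(x2, c):
    """Point (x1, x2) on x2^2 = 4c(x1 - c) for a given x2."""
    return c + x2 * x2 / (4.0 * c), x2

c = 1.5
for x2 in (-4.0, 0.0, 0.3, 7.0):
    x1, _ = parabola_point(x2, c)
    to_focus = math.hypot(x1 - 2 * c, x2)   # |l1|, distance to P = (2c, 0)
    to_line = abs(x1)                       # |l2|, distance to the x2 axis
    assert abs(to_focus - to_line) < 1e-9
```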




F. The Quadratic Form 



Introduction 


The reader should be cautioned because the strictly two dimensional
geometric analysis in no way prepares him to anticipate the
elegance and generality of the quadratic form in three or N dimensions.
The world of science contains many subjects which are described
in terms of quadratic forms, and the analysis of many of
these subjects depends upon the utilization of the most general
properties of the forms.

Inner products between vectors may result in quadratics of particular
interest. A trivial example is the inner product of a vector
with itself, giving the square of the magnitude of that vector; for example,
in two dimensions using matrix representation

$$\mathbf{r}\cdot\mathbf{r} = [x_1,\ x_2]\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1^2 + x_2^2.$$


If a constraint is placed upon this operation which requires that 
the inner product be equal to a constant, the resulting equation is 
that of the circle. 


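$$\mathbf{r}\cdot\mathbf{r} = R^2 \qquad\text{gives}\qquad x_1^2 + x_2^2 = R^2$$


In the example above we can think of the operation as one involving
a transformation matrix. In this case the identity or unit
matrix must be used.

$$\mathbf{r}\cdot\mathbf{r} = \mathbf{r}\cdot I\cdot\mathbf{r} = [x_1,\ x_2]\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1^2 + x_2^2 = R^2$$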


On the other hand a more general two dimensional quadratic
can be constructed by using a general matrix $A$ instead of the unit
matrix.

$$\mathbf{r}\cdot A\cdot\mathbf{r} = [x_1,\ x_2]\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

$$= x_1A_{11}x_1 + x_1A_{12}x_2 + x_2A_{21}x_1 + x_2A_{22}x_2$$

$$= A_{11}x_1^2 + (A_{12} + A_{21})\,x_1x_2 + A_{22}x_2^2$$

The constraint that this equation be equal to a fixed number $K$
provides a general two dimensional quadratic which is symmetric
with respect to the origin.

Because we are dealing with real numbers (i.e., the values of $x_1$
and $x_2$ are real) and because one generally starts from the scalar
quadratic equation, no generality is lost by utilizing a symmetric
matrix $A$,

$$A_{12} = A_{21} \qquad\text{or}\qquad A_{mn} = A_{nm}.$$

The most general equation for a conic section may contain linear 
terms describing the translation of the center of symmetry. The 
operation to be described pertains only to the ellipse and hyperbola. 
The parabola does not contain a symmetry point; thus the operation 
of translation to be described will not apply in this one case. 



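The translated form (using a symmetric $A$) is

$$A_{11}x_1^2 + 2A_{12}x_1x_2 + A_{22}x_2^2 + d\,x_1 + e\,x_2 + f = 0$$

We first write this equation in the matrix form

$$\mathbf{r}\cdot A\cdot\mathbf{r} + \boldsymbol{\ell}\cdot\mathbf{r} + f = 0$$

where $\boldsymbol{\ell}$ (a constant vector) is defined as

$$\boldsymbol{\ell} = \begin{pmatrix} d \\ e \end{pmatrix}, \qquad\text{or}\qquad \tilde{\boldsymbol{\ell}} = [d,\ e].$$

Writing out the matrix form in detail,

$$[x_1,\ x_2]\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + [d,\ e]\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + f = 0$$

The reader should keep in mind that

$$A_{12} = A_{21}.$$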

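We assume further that the position of the symmetry point $O'$ is
located by the position vector $\mathbf{R}$ having components $h$ and $g$, i.e.,

$$\mathbf{R} = \begin{pmatrix} h \\ g \end{pmatrix}.$$

We can then replace $\mathbf{r}$ by the vector addition*

$$\mathbf{r} = \mathbf{R} + \mathbf{r}'$$

where $\mathbf{r}'$ has the variables $x_1'$ and $x_2'$.

Substituting into $\mathbf{r}\cdot A\cdot\mathbf{r} + \boldsymbol{\ell}\cdot\mathbf{r} + f = 0$, we find

$$(\mathbf{R} + \mathbf{r}')\cdot A\cdot(\mathbf{R} + \mathbf{r}') + \boldsymbol{\ell}\cdot(\mathbf{R} + \mathbf{r}') + f = 0$$

Expanding this equation,

$$\mathbf{R}\cdot A\cdot\mathbf{R} + \mathbf{R}\cdot A\cdot\mathbf{r}' + \mathbf{r}'\cdot A\cdot\mathbf{R} + \mathbf{r}'\cdot A\cdot\mathbf{r}' + \boldsymbol{\ell}\cdot\mathbf{R} + \boldsymbol{\ell}\cdot\mathbf{r}' + f = 0$$

* Remember the rules for vector addition. If $\mathbf{R} = \begin{pmatrix} h \\ g \end{pmatrix}$ and $\mathbf{r}' = \begin{pmatrix} x_1' \\ x_2' \end{pmatrix}$ then $\mathbf{R} + \mathbf{r}' = \begin{pmatrix} h + x_1' \\ g + x_2' \end{pmatrix}$; or in the base vector representation $\mathbf{R} + \mathbf{r}' = (h + x_1')\,\mathbf{i} + (g + x_2')\,\mathbf{j}$.

The form of the quadratic when expanded about the symmetry
point is

$$\mathbf{r}'\cdot A\cdot\mathbf{r}' = K$$

where $K$ is a scalar.

If the expansion above is to reduce to this form the terms linear
in $\mathbf{r}'$ must vanish and the constants must be grouped.

Thus

$$(\mathbf{R}\cdot A + \boldsymbol{\ell})\cdot\mathbf{r}' + \mathbf{r}'\cdot A\cdot\mathbf{R} = 0$$

and

$$f + \mathbf{R}\cdot A\cdot\mathbf{R} + \boldsymbol{\ell}\cdot\mathbf{R} = -K$$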

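Warning. At this point the reader must exercise his knowledge
of matrices. The results of these two equations will be developed
to give the components of $\mathbf{R}$ and the constant $K$, but all of the
rules of matrix multiplication and manipulation as outlined in
Chapter I must be used.

Consider the first equation,

$$(\tilde{\mathbf{R}}\cdot A + \tilde{\boldsymbol{\ell}})\cdot\mathbf{r}' + \tilde{\mathbf{r}}'\cdot(A\cdot\mathbf{R}) = 0$$

We first note that, $A$ being symmetric, $\tilde{\mathbf{R}}\cdot A = \widetilde{(A\cdot\mathbf{R})}$; then

$$\big(\widetilde{(A\cdot\mathbf{R})} + \tilde{\boldsymbol{\ell}}\big)\cdot\mathbf{r}' + \tilde{\mathbf{r}}'\cdot(A\cdot\mathbf{R}) = 0$$

$A\cdot\mathbf{R}$ is a vector, say $\mathbf{p}$. Since the variables of $\mathbf{r}'$ are real we note
that

$$\tilde{\mathbf{r}}'\cdot\mathbf{p} = \tilde{\mathbf{p}}\cdot\mathbf{r}'$$

or

$$\tilde{\mathbf{r}}'\cdot(A\cdot\mathbf{R}) = \widetilde{(A\cdot\mathbf{R})}\cdot\mathbf{r}'$$

Thus our equation simplifies to

$$\{\,2\widetilde{(A\cdot\mathbf{R})} + \tilde{\boldsymbol{\ell}}\,\}\cdot\mathbf{r}' = 0$$

Because the components of $\mathbf{r}'$ are arbitrary the vector in curly
brackets must be zero.

$$2\,(A\cdot\mathbf{R}) + \boldsymbol{\ell} = 0$$

or

$$A\cdot\mathbf{R} = -\tfrac{1}{2}\,\boldsymbol{\ell}$$

Under the assumption that $A$ is nonsingular ($|A| \neq 0$), the last equation
can be solved in order to obtain the components of $\mathbf{R}$, the translation
vector.

$$A^{-1}\cdot A\cdot\mathbf{R} = \mathbf{R} = -\tfrac{1}{2}\,A^{-1}\cdot\boldsymbol{\ell}$$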

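In Chapter I the components of the inverse matrix $A^{-1}$ were
derived

$$(A^{-1})_{kn} = \frac{(-1)^{k+n}\ \text{minor}\ A_{nk}}{\text{Det}\ A}$$

Since

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad A^{-1} = \frac{1}{|A|}\begin{pmatrix} A_{22} & -A_{21} \\ -A_{12} & A_{11} \end{pmatrix}$$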


Operating upon $\boldsymbol{\ell}$ with $A^{-1}$ we find* that

$$\begin{pmatrix} h \\ g \end{pmatrix} = -\frac{1}{2\,|A|}\begin{pmatrix} A_{22} & -A_{21} \\ -A_{12} & A_{11} \end{pmatrix}\begin{pmatrix} d \\ e \end{pmatrix}$$

* Remember that the magnitude of a matrix $A$ by definition is the determinant of
that matrix; $|A| = \text{Det}\ A$.

or

$$h = -\frac{1}{2\,|A|}\,(A_{22}\,d - A_{21}\,e)$$

and

$$g = -\frac{1}{2\,|A|}\,(-A_{12}\,d + A_{11}\,e)$$

In a specific problem $A_{11}$, $A_{12} = A_{21}$, $A_{22}$, $d$ and $e$ are given; thus
$h$ and $g$ are readily obtained. (Note that $|A| = A_{11}A_{22} - A_{12}^2$.)

Once values have been obtained for $h$ and $g$ the constant $K$ can
be evaluated

$$K = -\mathbf{R}\cdot A\cdot\mathbf{R} - \boldsymbol{\ell}\cdot\mathbf{R} - f$$

or

$$K = -A_{11}h^2 - 2A_{12}hg - A_{22}g^2 - d\,h - e\,g - f$$
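The formulas for $h$, $g$ and $K$ can be exercised on a conic whose center is known in advance. The Python sketch below is an added illustration, not the author's procedure; the function name `translate_to_symmetry_point` and the test ellipse $(x_1 - 1)^2/4 + (x_2 + 2)^2 = 1$ are assumptions made here. Expanding that ellipse gives $A_{11} = 1/4$, $A_{12} = 0$, $A_{22} = 1$, $d = -1/2$, $e = 4$, $f = 13/4$.

```python
def translate_to_symmetry_point(A11, A12, A22, d, e, f):
    """Return (h, g, K) for A11 x1^2 + 2 A12 x1 x2 + A22 x2^2 + d x1 + e x2 + f = 0."""
    det = A11 * A22 - A12 * A12          # |A| for the symmetric matrix A
    if det == 0:
        raise ValueError("|A| = 0: no symmetry point (parabolic case)")
    h = -(A22 * d - A12 * e) / (2 * det)
    g = -(-A12 * d + A11 * e) / (2 * det)
    K = -A11 * h * h - 2 * A12 * h * g - A22 * g * g - d * h - e * g - f
    return h, g, K

# (x1 - 1)^2/4 + (x2 + 2)^2 = 1, expanded into the general form
h, g, K = translate_to_symmetry_point(0.25, 0.0, 1.0, -0.5, 4.0, 3.25)
print(h, g, K)   # center (h, g) and right-hand side K of r'.A.r' = K
```

For this example the center should come out at $(1, -2)$ with $K = 1$, which recovers the form $x_1'^2/4 + x_2'^2 = 1$.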


It would seem that an overbearing amount of formalizing has been 
utilized to obtain this result. To the reader familiar with the tradi¬ 
tional approach to quadratics in two dimensions it is apparent that 
these equations of translation can be achieved by elementary 
algebraic methods. 

The matrix method actually achieves the solution with the least 
amount of detailed expansion in terms of components. It should be 
noticed that the manipulations up to the final writing of the result 
are purely symbolic. 

This approach has the attribute that it not only works with two
dimensional problems, but can be applied in exactly the same
manner to problems in N dimensions. A quadratic problem in 3 or
more variables can be readily solved by matrix techniques, whereas
the elementary approach becomes increasingly tedious.

Properties of the Quadratic Form

The plane curves form a useful illustration of the quadratic form.
In the preceding section the two dimensional quadratic was written
down in its most general form.

$$A_{11}x_1^2 + 2A_{12}x_1x_2 + A_{22}x_2^2 + d\,x_1 + e\,x_2 + f = 0$$

During the discussion of this problem it was mentioned that the
techniques of translating to the symmetry point and those techniques
to follow would apply to the quadratic in N dimensions,

$$\sum_{i=1}^{N}\sum_{j=1}^{N} x_i A_{ij} x_j + \sum_{j=1}^{N} \ell_j x_j + f = 0$$

or

$$\mathbf{r}\cdot A\cdot\mathbf{r} + \boldsymbol{\ell}\cdot\mathbf{r} + f = 0$$

where $\mathbf{r}$ and $\boldsymbol{\ell}$ are vectors having N elements while $A$ is an N x N
symmetric matrix

$$A_{ij} = A_{ji}$$

In the same manner as before we can translate this equation to
the symmetry point by letting

$$\mathbf{r} = \mathbf{R} + \mathbf{r}'$$

where $\mathbf{R}$ is an N dimensional constant vector.

Under these operations we find that

$$\mathbf{R} = -\tfrac{1}{2}\,A^{-1}\cdot\boldsymbol{\ell}$$

and

$$K = -\mathbf{R}\cdot A\cdot\mathbf{R} - \boldsymbol{\ell}\cdot\mathbf{R} - f$$

Substituting for $\mathbf{R}$

$$K = -\tfrac{1}{4}\,\boldsymbol{\ell}\cdot A^{-1}\cdot(A\cdot A^{-1})\cdot\boldsymbol{\ell} + \tfrac{1}{2}\,\boldsymbol{\ell}\cdot A^{-1}\cdot\boldsymbol{\ell} - f$$

$$= \boldsymbol{\ell}\cdot\{-\tfrac{1}{4}A^{-1} + \tfrac{1}{2}A^{-1}\}\cdot\boldsymbol{\ell} - f = \tfrac{1}{4}\,\boldsymbol{\ell}\cdot A^{-1}\cdot\boldsymbol{\ell} - f$$

After translation we are left with an equation which involves no
linear terms in $\mathbf{r}$ (the primes can be dropped at this point),

$$\mathbf{r}\cdot A\cdot\mathbf{r} = K$$

or

$$\sum_{m}\sum_{n} x_m A_{mn} x_n = K.$$


If the reader still finds the notation bothersome, expand again the
two dimensional problem (i.e., set N = 2): it is good for the soul.

Our problem from here on will basically consist of the development
of methods by which we eliminate the off diagonal terms, or
the terms involving $x_l x_k$, where $l \neq k$.

This elimination corresponds
to a rotation of the coordinates about the symmetry point to a position
in which the coordinate axes are aligned with the symmetry
axes: the semi-major and semi-minor axes in the case of the ellipse.

The Discriminant




After the translation of the quadratic to the symmetry point the 
reader should notice that the basic matrix A of the quadratic is the 
same. Thus we say that the matrix A is invariant under the opera¬ 
tion of translation. 

Minus the magnitude of A in the quadratic is called the discrim¬ 
inant. Usually the discriminant is defined for the two dimensional 
case; however we shall extend the name to mean minus the magni¬ 
tude of A in N dimensions. 

In two dimensions, as an example, the negative of the discriminant
is

$$|A| = \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} = A_{11}A_{22} - A_{12}A_{21} = A_{11}A_{22} - A_{12}^2$$


The conic sections are labeled by the
properties of their magnitudes.

1. The Ellipse.

   $|A| = A_{11}A_{22} - A_{12}^2 > 0$

   The circle is a special case in
   which $A_{12} = 0$ and $A_{11} = A_{22}$.

2. The Parabola.

   $|A| = 0$

   In this case no attempt is made
   to translate since either $A_{11}$ or
   $A_{22}$ is zero and $A_{12}$ is zero.

3. The Hyperbola.

   $|A| < 0$

   This problem has a symmetry
   point. It should be noted that
   two intersecting straight lines are
   a special case of this form.
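The labeling by the sign of $|A|$ can be written as a small routine. The Python sketch below is an added illustration; the function name `classify_conic` and the tolerance `tol` are assumptions made here, and the routine looks only at the matrix $A$, so it does not distinguish degenerate cases such as two intersecting lines or a form with no real points.

```python
def classify_conic(A11, A12, A22, tol=1e-12):
    """Label a two dimensional quadratic by the sign of |A| = A11*A22 - A12^2."""
    det = A11 * A22 - A12 * A12
    if det > tol:
        # The circle is the special ellipse with A12 = 0 and A11 = A22.
        if abs(A12) <= tol and abs(A11 - A22) <= tol:
            return "circle"
        return "ellipse"
    if det < -tol:
        return "hyperbola"
    return "parabola"

print(classify_conic(7 / 16, 3 * 3 ** 0.5 / 16, 13 / 16))  # the rotated ellipse used later
print(classify_conic(0.0, 0.5, 0.0))                       # x1 x2 = 1
```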
Rotation of the Quadratic Form

Considering the symmetric form

$$\mathbf{r}\cdot A\cdot\mathbf{r} = [x_1,\ x_2]\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = K$$

or

$$A_{11}x_1^2 + 2A_{12}x_1x_2 + A_{22}x_2^2 = K$$

The rotation of the coordinates implies
a linear orthogonal transformation of the x's,

$$x_1' = (\cos\phi)\,x_1 + (\sin\phi)\,x_2$$
$$x_2' = (-\sin\phi)\,x_1 + (\cos\phi)\,x_2$$

written

$$\mathbf{r}' = S\cdot\mathbf{r} = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

In the same manner $x_m$ can be defined in terms of the $x_l'$'s (refer
to Chapter I)

$$x_1 = (\cos\phi)\,x_1' + (-\sin\phi)\,x_2'$$
$$x_2 = (\sin\phi)\,x_1' + (\cos\phi)\,x_2'$$

or

$$\mathbf{r} = S^{-1}\cdot\mathbf{r}' = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix}\begin{pmatrix} x_1' \\ x_2' \end{pmatrix}$$

Although we are specifically writing out the problem of two
dimensions the reader should keep in mind that the operations are
in general orthogonal transformations and can be applied to N
dimensions.

Going back to the two dimensional quadratic form

$$A_{11}x_1^2 + 2A_{12}x_1x_2 + A_{22}x_2^2 = K$$

we can substitute the expansions of $x_1$ and $x_2$ in terms of $x_1'$ and $x_2'$.
Upon performing this substitution we obtain a new equation

$$B_{11}x_1'^2 + 2B_{12}x_1'x_2' + B_{22}x_2'^2 = K'$$


The reader should perform this substitution and show that in
the case of the orthogonal transformation

$$|A| = |B| \qquad\text{or}\qquad A_{11}A_{22} - A_{12}^2 = B_{11}B_{22} - B_{12}^2 \qquad (1)$$

Also

$$A_{11} + A_{22} = B_{11} + B_{22} \qquad (2)$$

and

$$K = K' \qquad (3)$$


The sum of the diagonal elements of a matrix is called the trace
and is written

$$\text{Tr}\,A = A_{11} + A_{22}$$

or

$$\text{Tr}\,A = \sum_{\text{all } n} A_{nn}$$

The two important invariants of the quadratic (or bilinear) form
under an orthogonal transformation are

a) The magnitude of $A$

   i.e. $|A| = |B|$

b) The trace of $A$

   $\text{Tr}\,A = \text{Tr}\,B$
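The two invariants can be confirmed by carrying out the substitution numerically. The Python sketch below is an added illustration; the function name `rotate_form` and the sample matrix and angle are assumptions made here. The elements of $B$ are those obtained by inserting $x_1 = (\cos\phi)x_1' - (\sin\phi)x_2'$ and $x_2 = (\sin\phi)x_1' + (\cos\phi)x_2'$ into the quadratic and collecting terms.

```python
import math

def rotate_form(A11, A12, A22, phi):
    """Elements (B11, B12, B22) of the quadratic after the rotation r' = S.r through phi."""
    c, s = math.cos(phi), math.sin(phi)
    B11 = A11 * c * c + 2 * A12 * c * s + A22 * s * s
    B12 = (A22 - A11) * c * s + A12 * (c * c - s * s)
    B22 = A11 * s * s - 2 * A12 * c * s + A22 * c * c
    return B11, B12, B22

A11, A12, A22 = 2.0, 0.5, 1.0
B11, B12, B22 = rotate_form(A11, A12, A22, 0.7)

# Invariance of the magnitude and of the trace under the rotation
assert abs((A11 * A22 - A12 ** 2) - (B11 * B22 - B12 ** 2)) < 1e-12
assert abs((A11 + A22) - (B11 + B22)) < 1e-12
```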

We have demonstrated these invariants in the two dimensional
case and have indicated that this can be done by direct substitution
of the expansions of $x_j$ in terms of the $x_k'$'s.

This demonstration can be performed in a much more sophisticated
and powerful manner by utilizing our symbolic matrix
notation.

Starting with

$$\tilde{\mathbf{r}}\cdot A\cdot\mathbf{r} = K$$

we substitute

$$\mathbf{r} = S^{-1}\cdot\mathbf{r}', \qquad\text{and}\qquad \tilde{\mathbf{r}} = \tilde{\mathbf{r}}'\cdot\widetilde{(S^{-1})} = \tilde{\mathbf{r}}'\cdot S$$

giving

$$\tilde{\mathbf{r}}\cdot A\cdot\mathbf{r} = (\tilde{\mathbf{r}}'\cdot S)\cdot A\cdot(S^{-1}\cdot\mathbf{r}') = \tilde{\mathbf{r}}'\cdot(S\cdot A\cdot S^{-1})\cdot\mathbf{r}' = \tilde{\mathbf{r}}'\cdot B\cdot\mathbf{r}' = K$$

Then

$$S\cdot A\cdot S^{-1} = B$$

This orthogonal transformation of $A$ is called a similarity transformation.

The invariants under a similarity transformation can be demonstrated
quite readily.

1. Invariance of $|A|$

Note that

$$|B| = |S\cdot A\cdot S^{-1}| = |S|\,|A|\,|S^{-1}|$$

Because the magnitude of an orthogonal
matrix is one,

$$|S| = 1 \qquad\text{and}\qquad |S^{-1}| = 1,$$

we have the result

$$|B| = |A|$$



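2. Invariance of $\text{Tr}\,A = \sum_n A_{nn}$

The elements of $S\cdot A\cdot S^{-1}$ are

$$B_{mn} = (S\cdot A\cdot S^{-1})_{mn} = \sum_{k=1}^{N}\sum_{l=1}^{N} S_{mk}\,A_{kl}\,S_{ln}'$$

where $S_{ln}'$ denotes an element of $S^{-1}$. Consider

$$\text{Tr}\,B = \sum_{m=1}^{N} B_{mm} = \sum_{m=1}^{N}\sum_{k=1}^{N}\sum_{l=1}^{N} S_{mk}\,A_{kl}\,S_{lm}'$$

If we sum over $m$ first, remembering that $S^{-1}\cdot S = I$,

$$\sum_{m=1}^{N} S_{lm}'\,S_{mk} = \delta_{lk}$$

we find

$$\text{Tr}\,B = \sum_{k=1}^{N}\sum_{l=1}^{N} A_{kl}\Big\{\sum_{m=1}^{N} S_{lm}'\,S_{mk}\Big\} = \sum_{k=1}^{N}\sum_{l=1}^{N} A_{kl}\,\delta_{lk}$$

$$\text{Tr}\,B = \sum_{k=1}^{N} A_{kk} = \text{Tr}\,A$$

Again! If the reader finds this too elegant, perform the sum
directly for N = 2, i.e., for a two dimensional case.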



A nonzero off-diagonal element $A_{12}$
implies that the semi-major and semi-minor axes do not lie along
the coordinate axes.

Starting with a general quadratic, one of the most useful problems
is that of finding that particular orthogonal transformation (rotation)
which makes the quadratic diagonal; in other words, we rotate to
eliminate all terms of the form $2A_{lm}x_lx_m$ where $l \neq m$.

Associated with this we also want to discover the lengths of the
semi-major and semi-minor axes. These lengths are fixed by the eigenvalues
of the quadratic: each eigenvalue is one over the square of the length of a principal axis.

We will first obtain the rotation to normal form by the technique
most often described in elementary texts. In the following section we
will employ a more powerful and more convenient method using
matrices.

The simplest method of relating the normal form to the general
form is by writing the normal form in $x_1'$, $x_2'$ and rotating through
an angle $\alpha$ to the general form.

The normal form is

$$\frac{x_1'^2}{a^2} + \frac{x_2'^2}{b^2} = 1$$

We can relate $x_1'$ and $x_2'$ to $x_1$ and $x_2$ by means of the equations
of rotation (or the rotation matrix).

$$\mathbf{r}' = \begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = S\cdot\mathbf{r}$$

giving

$$x_1' = (\cos\alpha)\,x_1 + (\sin\alpha)\,x_2$$
$$x_2' = (-\sin\alpha)\,x_1 + (\cos\alpha)\,x_2$$


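Substituting for $x_1'$ and $x_2'$ in the normal form, we obtain

$$A\,x_1^2 + B\,x_1x_2 + C\,x_2^2 = 1^*$$

where

$$A = \frac{\cos^2\alpha}{a^2} + \frac{\sin^2\alpha}{b^2}$$

$$B = 2\left(\frac{1}{a^2} - \frac{1}{b^2}\right)\cos\alpha\,\sin\alpha$$

$$C = \frac{\sin^2\alpha}{a^2} + \frac{\cos^2\alpha}{b^2}$$

* Here we have used $A_{11} = A$, $2A_{12} = B$ and $A_{22} = C$.

The discriminant is invariant,

$$B^2 - 4AC = -\frac{4\,(\cos^2\alpha + \sin^2\alpha)^2}{a^2b^2} = -\frac{4}{a^2b^2}$$

(does not depend on $\alpha$).

In addition to this

$$A + C = \frac{1}{a^2} + \frac{1}{b^2}$$

The angle of rotation follows from the expressions for $A$, $B$ and $C$ above:

$$\cot(2\alpha) = \frac{A - C}{B}$$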
Diagonalization of the General Form 


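The diagonalization of the quadratic in N dimensions can be
obtained by more general techniques. Once again the two dimensional
quadratic will be utilized to illustrate the general method.
Starting with the non-diagonal form

$$\tilde{\mathbf{r}}\cdot A\cdot\mathbf{r} = [x_1,\ x_2]\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = A_{11}x_1^2 + 2A_{12}x_1x_2 + A_{22}x_2^2 = K$$

We want the transformation (rotation in two dimensions) $S$ which
produces the diagonal form $\Gamma$,* where

$$\tilde{\mathbf{r}}'\cdot\Gamma\cdot\mathbf{r}' = [x_1',\ x_2']\begin{pmatrix} \Gamma_{11} & 0 \\ 0 & \Gamma_{22} \end{pmatrix}\begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \Gamma_{11}x_1'^2 + \Gamma_{22}x_2'^2 = K$$

The orthogonal transformation from the form $A$ to the diagonal
form $\Gamma$ is obtained by the operator $S$, where

$$\mathbf{r} = S^{-1}\cdot\mathbf{r}', \qquad\text{or}\qquad \tilde{\mathbf{r}} = \tilde{\mathbf{r}}'\cdot S$$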

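* $\Gamma$ is a special case of the transformed matrix $B$ discussed in section F-2.

Substituting for $\mathbf{r}$ we obtain

$$\tilde{\mathbf{r}}\cdot A\cdot\mathbf{r} = \tilde{\mathbf{r}}'\cdot(S\cdot A\cdot S^{-1})\cdot\mathbf{r}' = \tilde{\mathbf{r}}'\cdot\Gamma\cdot\mathbf{r}' = K$$

As before, under the condition that $S$ is an orthogonal transformation,

$$S\cdot A\cdot S^{-1} = \Gamma$$

or

$$\begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} S_{11}' & S_{12}' \\ S_{21}' & S_{22}' \end{pmatrix} = \begin{pmatrix} \Gamma_{11} & 0 \\ 0 & \Gamma_{22} \end{pmatrix}$$

Several things must be kept in mind: first,

$$S^{-1} = \tilde{S} = \begin{pmatrix} S_{11} & S_{21} \\ S_{12} & S_{22} \end{pmatrix};$$

second, while $S$ is not symmetric, always remember that $A$ is
symmetric

$$A_{12} = A_{21}$$

To accomplish the diagonalization of $A$ we utilize the basic
characteristic of $S$, namely that the magnitude of $\mathbf{r}$ is invariant
under the transformation $S$.

$$\tilde{\mathbf{r}}\cdot\mathbf{r} = \tilde{\mathbf{r}}'\cdot\mathbf{r}'$$

We multiply $|\mathbf{r}|^2$ by some constant $\lambda$ so that the product equals
the constant $K$; then subtract this equation from the quadratic,

$$\tilde{\mathbf{r}}\cdot A\cdot\mathbf{r} = \tilde{\mathbf{r}}'\cdot\Gamma\cdot\mathbf{r}' = K$$

$$\lambda\,\tilde{\mathbf{r}}\cdot I\cdot\mathbf{r} = \lambda\,\tilde{\mathbf{r}}'\cdot I\cdot\mathbf{r}' = K$$

Subtracting, we obtain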


$$\tilde{\mathbf{r}}\cdot\{A - \lambda I\}\cdot\mathbf{r} = \tilde{\mathbf{r}}'\cdot\{\Gamma - \lambda I\}\cdot\mathbf{r}' = 0$$

At this stage in the development of the analysis of quadratics a
similarity transformation of the symmetric matrix $A$ has been made
taking it to another (congruent) matrix $\Gamma$. Although it was claimed
that there exists one rotation which makes $\Gamma$ diagonal, the proof
was not given.

Using the 2 x 2 matrix as an example for simplicity we can
demonstrate that $A$ can be diagonalized and simultaneously that
the actual algebraic operations can be represented graphically. To
accomplish these results let us return to the general quadratic,

$$\tilde{\mathbf{r}}\cdot A\cdot\mathbf{r} = \tilde{\mathbf{r}}'\cdot\Gamma\cdot\mathbf{r}' = 1$$

From here on $K$ will be taken as 1 without any loss of generality.

Both of these bilinear forms can be thought of as the equations
for an ellipse.

The final quadratic representing the equation for two straight
lines was obtained by subtracting the bilinear circular forms from
the elliptic forms. The circular forms were

$$\lambda\,\tilde{\mathbf{r}}\cdot I\cdot\mathbf{r} = \lambda\,\tilde{\mathbf{r}}'\cdot I\cdot\mathbf{r}' = 1$$
where $\lambda$ equals one over the square of the radius of the circle, $\lambda = 1/R^2$.

The subtraction represents a solution which gives two straight
lines passing through the points of intersection of the circle and the
ellipse. The geometric construction of $\tilde{\mathbf{r}}\cdot(A - \lambda I)\cdot\mathbf{r} = 0$ indicates
that there is a maximum and minimum value of $R = 1/\sqrt{\lambda}$ resulting
from the two conditions under which the circle is just tangent
to the ellipse.

The existence of the upper and lower limit upon $\lambda$ can be demonstrated
by solving the quadratic,

$$\tilde{\mathbf{r}}\cdot(A - \lambda I)\cdot\mathbf{r} = (A_{11} - \lambda)\,x_1^2 + 2A_{12}\,x_1x_2 + (A_{22} - \lambda)\,x_2^2 = 0$$

for $x_1$ in terms of $x_2$ and by utilizing the fact that both $x_1$ and $x_2$
must be real. Solving for $x_1$

$$x_1 = \frac{\{-A_{12} \pm \sqrt{A_{12}^2 - (A_{11} - \lambda)(A_{22} - \lambda)}\,\}\;x_2}{(A_{11} - \lambda)}$$

or

$$x_1 = \frac{\{-A_{12} \pm \sqrt{-|A - \lambda I|}\,\}\;x_2}{(A_{11} - \lambda)}$$

The condition that $x_1$ and $x_2$ be real requires that


$$A_{12}^2 - (A_{11} - \lambda)(A_{22} - \lambda) \ge 0$$

Therefore the magnitude of $(A - \lambda I)$ must be negative or zero;

$$|A - \lambda I| \le 0$$

From the relation between $x_1$ and $x_2$ we find that*

$$\lambda_{\min} \le \lambda \le \lambda_{\max}$$

The proof of this statement is apparent from the geometric example.
Algebraic proof is left as an exercise.

The minimum value of $\lambda$ occurs when the radius of the circle is
equal to the length of the semi-major axis

$$R = a = 1/\sqrt{\lambda_{\min}}$$

In like manner the maximum value, $\lambda_{\max}$, is obtained when the
radius of the circle is equal to the length of the semi-minor axis

$$R = b = 1/\sqrt{\lambda_{\max}}$$
* We have utilized the ellipse to illustrate this algebraic procedure. Later the
problem of the hyperbola will also be analyzed as an eigenvalue problem. In
this case the approach is exactly the same algebraically. The graphical analysis
utilizes tangency to the primary hyperbola and to its conjugate in order to set
the maximum and minimum. The reader should demonstrate as an exercise
that the circle tangent to the conjugate hyperbola is associated with a negative
eigenvalue.

From the pictures of the maximum and minimum condition on
$\lambda$ we see that the two straight lines defined by the intersection of
the circle and the ellipse coalesce into one line when $\lambda$ takes on
either its maximum value or its minimum value. These lines are
observed to lie along the principal axes of the quadratic, i.e., the
semi-major or semi-minor axes.

Because

$$x_1 = \frac{\{-A_{12} \pm \sqrt{-|A - \lambda I|}\,\}\;x_2}{(A_{11} - \lambda)}$$

or

$$x_2 = \frac{\{-A_{12} \pm \sqrt{-|A - \lambda I|}\,\}\;x_1}{(A_{22} - \lambda)}$$

a single line solution is obtained only when $|A - \lambda I|$ takes on its
maximum value

$$|A - \lambda I| = 0$$

The determinant of $(A - \lambda I)$ is known as the secular determinant
while

$$|A - \lambda I| = (A_{11} - \lambda)(A_{22} - \lambda) - A_{12}^2 = 0$$

is called the characteristic equation.

Here the characteristic equation is a polynomial of second degree
and it has two roots $\lambda_1$ and $\lambda_2$ called the eigenvalues.

$$\lambda_j = \tfrac{1}{2}(A_{11} + A_{22}) \pm \sqrt{\tfrac{1}{4}(A_{11} - A_{22})^2 + A_{12}^2}$$

In general for a space of N dimensions the characteristic equation
is a polynomial of Nth degree having N roots or eigenvalues.
Let $\lambda_j$ be any one of the eigenvalues or roots. Because

$$|A - \lambda I| = |\Gamma - \lambda I| = (\lambda_1 - \lambda)(\lambda_2 - \lambda) = 0$$

it is possible to find one rotation $S$ which makes

$$\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = \begin{pmatrix} \Gamma_{11} & 0 \\ 0 & \Gamma_{22} \end{pmatrix}$$

$\lambda_j$ can now be utilized in the remaining discussion in place of $\Gamma_{jj}$.
If this is the case

$$\begin{vmatrix} (\lambda_1 - \lambda) & 0 \\ 0 & (\lambda_2 - \lambda) \end{vmatrix} = |\Gamma - \lambda I| = (\lambda_1 - \lambda)(\lambda_2 - \lambda) = 0$$

which satisfies the condition on the roots of the polynomial. For the
case of N dimensions and N roots $\lambda_j$,

$$|A - \lambda I| = |\Gamma - \lambda I| = \prod_{j=1}^{N}(\lambda_j - \lambda) = 0$$

where $\prod$ signifies a repeated product,

$$\prod_{j=1}^{N}(\lambda_j - \lambda) = (\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_N - \lambda).$$

The previous discussion shows that a symmetric matrix $A$ can be
diagonalized by a similarity transformation* and that the elements
of the congruent diagonal matrix $\Gamma$ are obtained as roots of the
characteristic equation.

Further these roots $\lambda_j$ are equal to one over the square of the
lengths of the principal axes.

* We can use this transformation to demonstrate that $A$ can be diagonalized.
$S\cdot A\cdot S^{-1} = \Gamma$ can be written $A\cdot S^{-1} = S^{-1}\cdot\Gamma$, or $\sum_n A_{mn}S_{nj}' = \sum_n S_{mn}'\,\Gamma_{nj}$.
Assume that $\Gamma$ is diagonal; then $\sum_n A_{mn}S_{nj}' = S_{mj}'\,\Gamma_{jj}$, or $\sum_n \{A_{mn} - \Gamma_{jj}\,\delta_{mn}\}\,S_{nj}' = 0$.
A sufficient condition that this last equation have a nontrivial solution is
that $|A - \Gamma_{jj}I| = 0$. The existence of a solution demonstrates that $\Gamma$ can be
made diagonal, keeping in mind that $S^{-1}$ is orthogonal. This can be accomplished if
$A$ is symmetric.

This is made more obvious by examining the ellipse $\tilde{\mathbf{r}}'\cdot\Gamma\cdot\mathbf{r}' = 1$.

$$\tilde{\mathbf{r}}'\cdot\Gamma\cdot\mathbf{r}' = [x_1',\ x_2']\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}\begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \lambda_1x_1'^2 + \lambda_2x_2'^2 = \frac{x_1'^2}{a^2} + \frac{x_2'^2}{b^2} = 1$$

In other words $\tilde{\mathbf{r}}'\cdot\Gamma\cdot\mathbf{r}' = 1$ is the normal form.

The directions of the principal axes (semi-major and semi-minor
in two dimensions) can be developed by utilizing the condition
$|A - \lambda I| = 0$.

If we impose the condition on the vector $\mathbf{r}$ that the linear form

$$\{A - \lambda I\}\cdot\mathbf{r} = 0$$

then a necessary condition that the solutions for the components of
$\mathbf{r}$ be non-trivial (i.e., not all zero) is that

$$|A - \lambda I| = 0$$

This condition of course is that constraint which gives the eigenvalues
$\lambda_j$ as the roots of this equation. If these unique values or
eigenvalues are then employed in the linear form, the solutions for
the components of the corresponding vector are vectors parallel to the
principal axes.

$$\{A - \lambda_j I\}\cdot\mathbf{R}_j = 0$$

The constraint that this linear form vanish is a sufficient (not
necessary) condition that the quadratic relationship be satisfied. In
other words

$$\tilde{\mathbf{R}}_j\cdot\{A - \lambda_j I\}\cdot\mathbf{R}_j = 0$$

since $\tilde{\mathbf{R}}_j$ times a “null” vector is zero.

The solutions $\mathbf{R}_j$ (having components $X_{mj}$) of

$$\{A - \lambda_j I\}\cdot\mathbf{R}_j = \sum_{m=1}^{N}\big(A_{lm}X_{mj} - \delta_{lm}X_{mj}\lambda_j\big) = 0$$

are called the eigenvectors of the problem. We changed the notation
from $\mathbf{r}$ to $\mathbf{R}_j$ because of the restricted nature of the $\mathbf{R}$'s. They satisfy
the linear eigenvalue equation in a non-trivial fashion.

In the earlier portion of this section the characteristic equation
was required in order that we obtain the single straight line solutions
to the quadratic. We remarked at that time that these lines
passed through the points of tangency between the circle and the
ellipse.

The eigenvectors are vector forms of those same lines because of
their unique dependence upon the eigenvalues. More will be said
concerning the eigenvectors in the next section. However, before
proceeding let us restate the important consequences of this discussion.

1. The diagonalization of the quadratic is performed by finding the
roots $\lambda_j$ of the polynomial $|A - \lambda I| = 0$.

These roots are called the eigenvalues. The length of the jth principal
axis is one over the square root of $\lambda_j$.

2. The direction of the jth principal axis is determined by using $\lambda_j$
in the eigenvalue problem $(A - \lambda_j I)\cdot\mathbf{R}_j = 0$.

The solutions $\mathbf{R}_j$ are called the eigenvectors and are directed along
the principal axes.

Before closing this section one unique quadratic, the circle,
should be mentioned. This quadratic is degenerate in that the two
eigenvalues are the same. Whenever a quadratic has degenerate
eigenvalues, the associated eigenvectors are not specifically determined
in spite of the fact that they remain mutually perpendicular.
To understand this the reader should consider the two dimensional
problem of the circle. Any axis serves as, say, a semi-major axis.
However, once one axis is chosen (arbitrarily) the second axis is
fixed because it must be perpendicular to the first.

As an example consider the ellipse $\frac{7}{16}x_1^2 + \frac{3\sqrt{3}}{8}x_1x_2 + \frac{13}{16}x_2^2 = 1$, for which

$$A = \begin{pmatrix} \dfrac{7}{16} & \dfrac{3\sqrt{3}}{16} \\[2mm] \dfrac{3\sqrt{3}}{16} & \dfrac{13}{16} \end{pmatrix}$$

and

$$|A - \lambda I| = \begin{vmatrix} \left(\dfrac{7}{16} - \lambda\right) & \dfrac{3\sqrt{3}}{16} \\[2mm] \dfrac{3\sqrt{3}}{16} & \left(\dfrac{13}{16} - \lambda\right) \end{vmatrix} = 0$$

When we expand the determinant and solve for $\lambda$ we find

$$\left(\frac{7}{16} - \lambda\right)\left(\frac{13}{16} - \lambda\right) - \frac{27}{(16)^2} = 0$$

with the solutions

$$\lambda_1 = \frac{5}{8} - \frac{3}{8} = \frac{1}{4} = \frac{1}{a^2}$$

and

$$\lambda_2 = \frac{5}{8} + \frac{3}{8} = 1 = \frac{1}{b^2}$$

The equation

$$\frac{x_1'^2}{4} + \frac{x_2'^2}{1} = 1$$

is in the normal form.
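The arithmetic of this example can be reproduced with the closed-form roots of the characteristic equation. The Python sketch below is an added illustration; the function name `eigenvalues_2x2` is an assumption made here, and the routine applies only to a real symmetric 2 x 2 matrix.

```python
import math

def eigenvalues_2x2(A11, A12, A22):
    """Roots (smaller, larger) of (A11 - lam)(A22 - lam) - A12^2 = 0."""
    mean = 0.5 * (A11 + A22)
    root = math.sqrt(0.25 * (A11 - A22) ** 2 + A12 ** 2)
    return mean - root, mean + root

lam1, lam2 = eigenvalues_2x2(7 / 16, 3 * math.sqrt(3) / 16, 13 / 16)
a = 1 / math.sqrt(lam1)   # semi-major axis, from the smaller eigenvalue
b = 1 / math.sqrt(lam2)   # semi-minor axis, from the larger eigenvalue
print(lam1, lam2, a, b)
```

Up to rounding, the printed values should agree with $\lambda_1 = 1/4$, $\lambda_2 = 1$, $a = 2$ and $b = 1$ found above.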


Another simple example of the problem of diagonalization is that
of the hyperbola

$$x_1x_2 = 1$$

The eigenvalues give one over the squares of the distances from
the origin to the hyperbola and to its conjugate. As in the case of
the ellipse the problem has a set of symmetry axes. In solving this
problem we will find that one of the fundamental lengths would be
imaginary because one of the eigenvalues of the problem is negative.
This is the eigenvalue associated with the conjugate hyperbola.

In the problem above

$$A_{11} = 0, \qquad A_{12} = A_{21} = 1/2 \qquad\text{and}\qquad A_{22} = 0$$

Then

$$|A - \lambda I| = \begin{vmatrix} -\lambda & 1/2 \\ 1/2 & -\lambda \end{vmatrix} = 0$$

The two roots are

$$\lambda_1 = \Gamma_{11} = \frac{1}{a^2} = +1/2$$

and

$$\lambda_2 = \Gamma_{22} = -\frac{1}{b^2} = -1/2$$

Thus $\dfrac{x_1'^2}{2} - \dfrac{x_2'^2}{2} = 1$ is the normal form.
The Eigenvalue Problem 

When the constant $K$ of a bilinear
or quadratic form is equal to
zero the equation is degenerate in
that it defines two straight lines.
Under such conditions the quadratic
can be factored into the product
of a set of linear algebraic
equations.

Stating this in mathematical language, the equation,

$$\tilde{\mathbf{r}}\cdot C\cdot\mathbf{r} = [x_1,\ x_2]\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0$$

is the equation of two straight lines. To show this expand the equation
above

$$\tilde{\mathbf{r}}\cdot C\cdot\mathbf{r} = C_{11}x_1^2 + 2C_{12}x_1x_2 + C_{22}x_2^2 = 0$$

This equation can be factored (after dividing through by $C_{11}$) into the form

$$(x_1 - m\,x_2)(x_1 - n\,x_2) = 0$$

The relation above has two solutions;

$$x_1 = m\,x_2$$

or

$$x_1 = n\,x_2$$
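The constants $m$ and $n$ follow from the quadratic formula applied to $x_1/x_2$. The Python sketch below is an added illustration; the function name `line_slopes` is an assumption made here, and it assumes $C_{11} \neq 0$ and real lines, i.e. $C_{12}^2 - C_{11}C_{22} \ge 0$.

```python
import math

def line_slopes(C11, C12, C22):
    """Return (m, n) with C11 x1^2 + 2 C12 x1 x2 + C22 x2^2 = C11 (x1 - m x2)(x1 - n x2)."""
    disc = C12 * C12 - C11 * C22          # equals -|C|
    if C11 == 0 or disc < 0:
        raise ValueError("requires C11 != 0 and |C| <= 0 for two real lines")
    root = math.sqrt(disc)
    return (-C12 + root) / C11, (-C12 - root) / C11

# (x1 - 2 x2)(x1 + 3 x2) = x1^2 + x1 x2 - 6 x2^2, so C11 = 1, C12 = 1/2, C22 = -6
m, n = line_slopes(1.0, 0.5, -6.0)
print(m, n)
```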


In the previous section it was shown that in the case of a circle
subtracted from an ellipse, these two lines passed through the points
of intersection of the circle and the ellipse.

The diagonalization of the quadratic was achieved by creating a
new quadratic which had the constant term equal to zero.

$$\tilde{\mathbf{r}}\cdot\{A - \lambda I\}\cdot\mathbf{r} = 0$$

The extreme cases of this equation occur when only one line is
defined; in other words the lines through the points of intersection
coalesce into one line and the circle is tangent to the quadratic $A$.
When only one line is defined the magnitude of $(A - \lambda I)$ takes on
its maximum value, zero. Under this condition particular values of
$\lambda$ are obtained, and these are called the eigenvalues. As we have
shown the eigenvalues are the roots of the characteristic equation

$$|A - \lambda I| = 0$$

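We labeled these roots $\lambda_1, \lambda_2, \ldots, \lambda_N$, or $\Gamma_{11}, \Gamma_{22}, \ldots, \Gamma_{NN}$.

Consider again the equation

$$\tilde{\mathbf{r}}\cdot\{A - \lambda I\}\cdot\mathbf{r} = 0$$

If this equation is zero it is sufficient that

$$\{A - \lambda_j I\}\cdot\mathbf{R}_j = 0$$

where $\mathbf{R}_j$ is a vector lying along the jth symmetry axis.

In order to investigate the details of the calculation of the $\mathbf{R}_j$'s
call the two roots $\lambda_1 = \dfrac{1}{a^2}$ and $\lambda_2 = \dfrac{1}{b^2}$. Each specific root then
determines a specific vector $\mathbf{R}_j$ called the eigenvector corresponding
to the eigenvalue $\lambda_j$.

Write

$$\mathbf{R}_j = \begin{pmatrix} X_{1j} \\ X_{2j} \end{pmatrix}$$

Then,

$$\{A - \lambda_j I\}\cdot\mathbf{R}_j = \begin{pmatrix} (A_{11} - \lambda_j) & A_{12} \\ A_{21} & (A_{22} - \lambda_j) \end{pmatrix}\begin{pmatrix} X_{1j} \\ X_{2j} \end{pmatrix} = 0$$

This matrix multiplication gives us two algebraic equations
specifying the ratio of the components of $\mathbf{R}_j$

$$A\cdot\mathbf{R}_j = \lambda_j\,\mathbf{R}_j$$

or

$$A_{11}X_{1j} + A_{12}X_{2j} = \lambda_j X_{1j}$$
$$A_{21}X_{1j} + A_{22}X_{2j} = \lambda_j X_{2j}$$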

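These two equations are not independent
in that they give the same
ratio of $X_{1j}$ to $X_{2j}$.

$$\frac{X_{1j}}{X_{2j}} = -\frac{A_{12}}{A_{11} - \lambda_j}$$

or

$$\frac{X_{1j}}{X_{2j}} = -\frac{A_{22} - \lambda_j}{A_{21}}$$

(Both forms provide the
same result.)

The length of the vector can be set by normalizing $\mathbf{R}_j$ (setting
$|\mathbf{R}_j| = 1$).

$$\mathbf{R}_j\cdot\mathbf{R}_j = X_{1j}^2 + X_{2j}^2 = 1,$$

$$X_{1j}^2\left\{1 + \frac{(A_{11} - \lambda_j)^2}{(A_{12})^2}\right\} = 1$$

and

$$X_{1j}^2 = \frac{(A_{12})^2}{(A_{11} - \lambda_j)^2 + (A_{12})^2}$$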



In general the eigenvectors (or
principal axes) of a symmetric matrix
are orthogonal. (Symmetric, $A_{jk} = A_{kj}$.)
It is apparent that many matrices
have eigenvalues and eigenvectors;
however, only the symmetric
matrices will always have orthogonal
eigenvectors.

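In the preceding section concerning
diagonalization of the quadratic
we solved for the eigenvalues of the
ellipse

$$\frac{7}{16}x_1^2 + \frac{3\sqrt{3}}{8}x_1x_2 + \frac{13}{16}x_2^2 = 1$$

and obtained

$$\lambda_1 = \frac{1}{4} \qquad\text{and}\qquad \lambda_2 = 1.$$

To obtain the direction of the symmetry axes, we must solve the
eigenvalue problem

$$\{A - \lambda_j I\}\cdot\mathbf{R}_j = 0$$

or

$$\begin{pmatrix} \left(\dfrac{7}{16} - \lambda_j\right) & \dfrac{3\sqrt{3}}{16} \\[2mm] \dfrac{3\sqrt{3}}{16} & \left(\dfrac{13}{16} - \lambda_j\right) \end{pmatrix}\begin{pmatrix} X_{1j} \\ X_{2j} \end{pmatrix} = 0$$

Take $\lambda_1 = 1/4$ $(j = 1)$; then

$$\left(\frac{7}{16} - \frac{1}{4}\right)X_{11} + \frac{3\sqrt{3}}{16}X_{21} = 0, \qquad X_{21} = -\frac{1}{\sqrt{3}}X_{11}$$

Normalizing we obtain,

$$X_{11}^2 + X_{21}^2 = \frac{4}{3}X_{11}^2 = 1, \qquad\text{and}\qquad X_{11} = \frac{\sqrt{3}}{2}$$

Giving in addition

$$X_{21} = -\frac{1}{2}$$

Now take $j = 2$ or $\lambda_2 = 1$; then

$$\left(\frac{7}{16} - 1\right)X_{12} + \frac{3\sqrt{3}}{16}X_{22} = 0, \qquad X_{12} = \frac{1}{\sqrt{3}}X_{22}$$

Normalizing

$$X_{12}^2 + X_{22}^2 = X_{22}^2\left(1 + \frac{1}{3}\right) = 1; \qquad X_{22} = \frac{\sqrt{3}}{2}$$

and

$$X_{12} = \frac{1}{2}$$

We can see this result graphically.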




Using these eigenvalues we can 
readily find the eigenvectors. 

{A — Ajl } • Rj = O 


Take 


Then 


= O 


-i x „ + ixj, = o 


or 


After normalizing 


X n = X 21 


Ri 




, or Ri = — y= {ci + e 2 } 
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Taking 


j - 2; A 2 = - I 


and 


x 12 = - X22 = - | 


Thus 


R 2 = 


V2 


1J 


, or R 2 =-{ —<1 + ( 2 } 


The sign is arbitrary for this vector. The sign was taken in order that $R_1$, $R_2$ form a right-handed system (or that $(R_1 \times R_2)$ is positive).


Properties of the Eigenvectors 


The relation between $R_j$ and $S^{-1}$.

It can be demonstrated in a relatively simple manner that the elements of the normalized $R_j$ form the columns of the $S^{-1}$ rotation matrix (equivalently, the rows of S).

The eigenvalue problem is

$$\{A - \lambda_j I\} \cdot R_j = O$$

or in terms of components

$$\sum_{n=1}^{N} A_{mn} X_{nj} - X_{mj} \lambda_j = O$$

The similarity transformation of A to the diagonal $\Gamma$ can be written

$$S \cdot A \cdot S^{-1} = \Gamma$$

or multiplying through by $S^{-1}$

$$A \cdot S^{-1} = S^{-1} \cdot \Gamma$$

The $m$-th equation of this set is (writing $S'_{mn}$ for the elements of $S^{-1}$)

$$\sum_{n=1}^{N} A_{mn} S'_{nj} = \sum_{n=1}^{N} S'_{mn} \Gamma_{nj}$$

but

$$\Gamma_{nj} = \delta_{nj} \lambda_j$$

and

$$\sum_{n=1}^{N} A_{mn} S'_{nj} = S'_{mj} \lambda_j$$

Therefore since $S'_{mj}$ and $X_{mj}$ satisfy the same equations,

$$S'_{mj} = X_{mj}$$


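In two dimensions

$$S^{-1} = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix}$$

Also we find that

$$\frac{X_{11}}{X_{21}} = \cot \alpha = \frac{\lambda_1 - A_{22}}{A_{21}}$$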


In our problem concerning the ellipse, $\alpha$ turns out to be −30°.

In the problem of the hyperbola $\alpha = +45°$.

If the matrix A is symmetric,

i.e. $\tilde{A} = A$

then it can be shown that the eigenvector solutions are mutually orthogonal (providing that two or more eigenvalues are not the same; an equality of the eigenvalues is called a degeneracy).

Consider the eigenvalue problem defining $R_n$ and the transposed eigenvalue problem defining $R_m$.

$$(A - \lambda_n I) \cdot R_n = O$$

$$R_m \cdot (\tilde{A} - \lambda_m I) = O$$

Multiply the first equation by $R_m$ from the left; multiply the second equation by $R_n$ from the right; and subtract.

$$R_m \cdot A \cdot R_n - R_m \cdot \tilde{A} \cdot R_n - (\lambda_n - \lambda_m) R_m \cdot R_n = O$$

If $\tilde{A} = A$ then

$$(\lambda_n - \lambda_m) R_m \cdot R_n = O$$

When $n \neq m$, $(\lambda_n - \lambda_m)$ is not zero (unless degenerate). Therefore

$$R_m \cdot R_n = O; \quad m \neq n$$

If the R's are normalized, i.e., unit vectors,

$$R_m \cdot R_n = \delta_{mn}$$

Summary and Applications 

It will again prove convenient to summarize the steps involved 
in solving either a quadratic or an eigenvalue problem. 

Given an equation of the type

$$r \cdot A \cdot r = K$$

or

$$\{A - \lambda I\} \cdot R = O$$

we proceed in the following manner:

1. The eigenvalues are obtained as roots of the algebraic equation

$$\text{Determinant}\,(A - \lambda I) = |A - \lambda I| = 0$$

If A is an N × N square array there should be N roots $\lambda_j$.

2. The eigenvector $R_j$ is obtained by substituting the specific root or eigenvalue $\lambda_j$;

$$\{A - \lambda_j I\} \cdot R_j = O$$

or

$$\sum_{n=1}^{N} A_{mn} X_{nj} = X_{mj} \lambda_j$$

3. The orthogonal transformation S which takes A to its diagonal form $\Gamma$ is obtained from the relation between the elements of $S^{-1}$ (written $S'_{mj}$) and the components of $R_j$.

$$S'_{mj} = X_{mj}$$
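The three steps can be followed mechanically for a 2 x 2 case. What follows is a sketch under stated assumptions, not a general routine: `diagonalize_2x2` and `matmul` are hypothetical names, the matrix must be real and symmetric with a non-zero off-diagonal element, and the sample matrix (zero diagonal, off-diagonal 1/2) is assumed to be the hyperbola matrix with eigenvalues +1/2 and −1/2.

```python
import math

def matmul(P, Q):
    """Plain matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def diagonalize_2x2(A):
    """Follow steps 1-3 for a real symmetric 2x2 matrix with A[0][1] != 0."""
    (a11, a12), (_, a22) = A
    # Step 1: roots of |A - lam I| = lam^2 - (trace) lam + (determinant) = 0.
    trace, det = a11 + a22, a11 * a22 - a12 * a12
    root = math.sqrt(trace * trace - 4.0 * det)
    lams = [(trace + root) / 2.0, (trace - root) / 2.0]
    # Step 2: a unit eigenvector for each root.
    vecs = []
    for lam in lams:
        x1, x2 = a12, lam - a11
        n = math.hypot(x1, x2)
        vecs.append((x1 / n, x2 / n))
    # Step 3: the R_j are the columns of S^-1; S is its transpose.
    S_inv = [[vecs[0][0], vecs[1][0]], [vecs[0][1], vecs[1][1]]]
    S = [[S_inv[0][0], S_inv[1][0]], [S_inv[0][1], S_inv[1][1]]]
    return lams, S, S_inv

A = [[0.0, 0.5], [0.5, 0.0]]          # assumed hyperbola matrix
lams, S, S_inv = diagonalize_2x2(A)
diag = matmul(matmul(S, A), S_inv)    # should be diagonal with lams on the diagonal
print(lams)
print(diag)
```

The signs of the eigenvectors produced here are whatever the formula gives; as noted in the text, the sign of each vector is arbitrary.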

The importance of the eigenvalue problem is much greater than 
one would surmise by examining its applications to geometric 
quantities. 

The problem is used in physics to solve a large variety of problems. 

Many problems involving the small amplitude oscillations of 
coupled harmonic oscillators reduce in form to the eigenvalue 
problem. 

An example can be indicated. 

Consider two point masses suspended by three springs in a linear 
array between two rigid walls. 

Assume that the motion is confined to a linear vertical motion 
for each mass. 

The equations of motion for this coupled system constitute an eigenvalue problem. In this case the eigenvalues turn out to be the squares of the two characteristic frequencies of the motion. The eigenvectors of this problem describe the two characteristic motions or modes as they are called.

The lowest frequency mode (for equal masses) is a motion in which the two masses are in phase as shown.

The higher frequency mode consists of a motion in which the masses are out of phase; that is to say that when one has a maximum displacement up the other has a maximum displacement down.

All other motions of the system are constituted by a linear combination of the two fundamental modes.

In order to illustrate this more vividly we can set up a simple problem and solve it.

In the diagram to follow assume that the vertical displacements of $M_1$ and $M_2$ are sufficiently small that the tensions in the springs are essentially constant and equal to T. Also let us take the case in which the masses are equal, i.e. $M_1 = M_2 = M$.

[Figure: the two modes of the two-mass system, labeled "Lowest eigenfrequency" and "Highest eigenfrequency".]

Because the motion is vertical we only consider the vertical components of T. The sum of the vertical forces on $M_1$ (neglect gravity) is

$$-\frac{x_1}{l} T - \frac{(x_1 - x_2)}{l} T$$

The sum of the vertical forces on $M_2$ is

$$-\frac{x_2}{l} T - \frac{(x_2 - x_1)}{l} T$$

By Newton's 2nd Law the vector sum of the forces on the body is equal to the time rate of change of the vector momentum. In this case the masses are constant in time; therefore the sum of the vector forces is equal to the mass times the vector acceleration. The forces are vertical; thus we need only consider the accelerations along $x_1$ and $x_2$ (shown in the diagram).

[Figure: the masses $M_1$ and $M_2$ with their displacements $x_1$ and $x_2$.]

$$M a_1 = -\frac{2 x_1}{l} T + \frac{x_2}{l} T$$

and

$$M a_2 = +\frac{x_1}{l} T - \frac{2 x_2}{l} T$$

To solve these equations we assume that the displacements $x_1$ and $x_2$ are harmonic, i.e.

$$x_k = b_k \cos \omega t + c_k \sin \omega t$$

Under such an assumption the acceleration is proportional to the displacement. This will be proven in Chapter 5.

$$a_k = -\omega^2 (b_k \cos \omega t + c_k \sin \omega t) = -\omega^2 x_k$$

Upon substituting these values for the acceleration into Newton's 2nd Law we obtain

$$-M \omega^2 x_1 = -\frac{2 x_1}{l} T + \frac{x_2}{l} T$$
$$-M \omega^2 x_2 = \frac{x_1}{l} T - \frac{2 x_2}{l} T$$

Rewriting

$$-2\Omega^2 x_1 + \Omega^2 x_2 = -\omega^2 x_1$$

and

$$\Omega^2 x_1 - 2\Omega^2 x_2 = -\omega^2 x_2$$

where

$$\Omega^2 = T / M l$$


If we now construct a column vector

$$\rho = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

and a matrix

$$A = \begin{pmatrix} 2\Omega^2 & -\Omega^2 \\ -\Omega^2 & 2\Omega^2 \end{pmatrix}$$

these two linear equations can be written

$$\{A - \omega^2 I\} \cdot \rho = O$$

A non-trivial solution exists for the equation when

$$|A - \omega^2 I| = O$$

Thus there are two possible values of $\omega^2$;

$$|A - \omega^2 I| = \begin{vmatrix} (2\Omega^2 - \omega^2) & -\Omega^2 \\ -\Omega^2 & (2\Omega^2 - \omega^2) \end{vmatrix} = O$$

or

$$(2\Omega^2 - \omega^2)^2 - \Omega^4 = 0$$

giving the roots

$$\omega_1^2 = \Omega^2 = T / M l$$
$$\omega_2^2 = 3\Omega^2 = 3T / M l$$
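The two roots follow directly from the factored determinant, which a few lines of code make explicit. This is a minimal sketch: the function name and the numerical values of T, M and l are illustrative assumptions, with l taken to be the length appearing in $\Omega^2 = T/Ml$.

```python
import math

def mode_frequencies_squared(T, M, l):
    """Roots w^2 of (2*Omega^2 - w^2)^2 - Omega^4 = 0, with Omega^2 = T / (M l)."""
    omega_sq = T / (M * l)
    # The bracket (2*Omega^2 - w^2) must equal +Omega^2 or -Omega^2.
    return 2.0 * omega_sq - omega_sq, 2.0 * omega_sq + omega_sq

w1_sq, w2_sq = mode_frequencies_squared(T=4.0, M=0.5, l=2.0)   # illustrative values
print(w1_sq, w2_sq)               # 4.0 12.0
print(math.sqrt(w2_sq / w1_sq))   # ratio of the two frequencies, sqrt(3)
```

Whatever the values of T, M and l, the higher frequency is √3 times the lower one.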

There are two possible values of $\omega^2$; therefore the most general solution must be a linear combination of these. Let

$$x_k = \sum_{n=1}^{2} \alpha_{kn} \{b_n \cos \omega_n t + c_n \sin \omega_n t\}$$

Later it can be demonstrated that $b_1$, $c_1$, $b_2$ and $c_2$ are evaluated from the initial conditions. The initial conditions are the values of $x_1$ and $x_2$ and their velocities at time t = 0. Returning to the solution $x_k$ in terms of $\omega_1$ and $\omega_2$ we can write down the acceleration;

$$a_k = -\sum_{n=1}^{2} \alpha_{kn} \omega_n^2 \{b_n \cos \omega_n t + c_n \sin \omega_n t\}$$


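Substituting this into the equations representing Newton's 2nd Law (written in the matrix form rather than as sets of equations),

$$a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = -A \cdot \rho = -\begin{pmatrix} 2\Omega^2 & -\Omega^2 \\ -\Omega^2 & 2\Omega^2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

or

$$a_k = -\sum_{l=1}^{2} A_{kl} x_l$$

Upon substitution for $a_k$ (the k-th acceleration) we obtain

$$-\sum_{n=1}^{2} \alpha_{kn} \omega_n^2 \{b_n \cos \omega_n t + c_n \sin \omega_n t\} = -\sum_{l=1}^{2} A_{kl} \sum_{n=1}^{2} \alpha_{ln} \{b_n \cos \omega_n t + c_n \sin \omega_n t\}$$

Collecting like coefficients of the bracketed terms and changing the order of summing on the right hand side,

$$\sum_{n=1}^{2} \left\{ \sum_{l=1}^{2} A_{kl} \alpha_{ln} - \alpha_{kn} \omega_n^2 \right\} \cdot \{b_n \cos \omega_n t + c_n \sin \omega_n t\} = O$$

For this equation to vanish

$$\sum_{l=1}^{2} A_{kl} \alpha_{ln} - \alpha_{kn} \omega_n^2 = O$$

or

$$\{A - \omega_n^2 I\} \cdot \alpha_n = O$$

where

$$\alpha_n = \begin{pmatrix} \alpha_{1n} \\ \alpha_{2n} \end{pmatrix}$$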


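Thus this coupled oscillator problem reduces to the eigenvalue problem. Knowing that

$$\omega_1^2 = \Omega^2$$

and

$$\omega_2^2 = 3\Omega^2$$

the reader should be in a position to show that for this particular problem,

$$\alpha_1 = \begin{pmatrix} \alpha_{11} \\ \alpha_{21} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

and

$$\alpha_2 = \begin{pmatrix} \alpha_{12} \\ \alpha_{22} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 1 \end{pmatrix}$$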

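The general solution can be written

$$\rho = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \sum_{n=1}^{2} \alpha_n \{b_n \cos \omega_n t + c_n \sin \omega_n t\}$$

or

$$x_1 = \alpha_{11} \{b_1 \cos \omega_1 t + c_1 \sin \omega_1 t\} + \alpha_{12} \{b_2 \cos \omega_2 t + c_2 \sin \omega_2 t\}$$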
Using the information that the velocities $V_j$ are given by*

$$V = \frac{d\rho}{dt} = \begin{pmatrix} dx_1/dt \\ dx_2/dt \end{pmatrix} = \begin{pmatrix} V_1 \\ V_2 \end{pmatrix} = \sum_{n=1}^{2} \alpha_n \omega_n \{-b_n \sin \omega_n t + c_n \cos \omega_n t\}$$

the reader should be able to demonstrate from the orthogonality of the eigenvectors $\alpha_n$ that

$$b_m = \alpha_m \cdot \rho\,(t = 0) = x_1(t = 0)\, \alpha_{1m} + x_2(t = 0)\, \alpha_{2m}$$

and

$$c_m = \frac{1}{\omega_m}\, \alpha_m \cdot V(t = 0) = \frac{1}{\omega_m} \{ V_1(t = 0)\, \alpha_{1m} + V_2(t = 0)\, \alpha_{2m} \}$$

* The derivatives shown here will be completely discussed in Chapter 5. They are included for completeness and need not be understood in detail.
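The way the initial conditions fix the motion can be sketched in code. This is an illustration under assumptions: the function name `motion` is hypothetical, the modes are taken as (1, 1)/√2 and (−1, 1)/√2 with squared frequencies Ω² and 3Ω², and the coefficient of each sine term is taken as the projection of the initial velocity divided by the mode frequency.

```python
import math

S2 = 1.0 / math.sqrt(2.0)
MODES = [(S2, S2), (-S2, S2)]     # alpha_1 (in phase), alpha_2 (out of phase)

def motion(x0, v0, omega_sq, t):
    """Displacements (x1, x2) at time t from initial positions x0 and velocities v0."""
    freqs = [math.sqrt(omega_sq), math.sqrt(3.0 * omega_sq)]
    x = [0.0, 0.0]
    for alpha, w in zip(MODES, freqs):
        b = alpha[0] * x0[0] + alpha[1] * x0[1]          # b_m = alpha_m . rho(0)
        c = (alpha[0] * v0[0] + alpha[1] * v0[1]) / w    # c_m = alpha_m . V(0) / w_m
        amp = b * math.cos(w * t) + c * math.sin(w * t)
        x[0] += alpha[0] * amp
        x[1] += alpha[1] * amp
    return tuple(x)

# Starting with only the first mass displaced excites both modes.
print(motion((1.0, 0.0), (0.0, 0.0), omega_sq=1.0, t=0.0))   # (1, 0) up to rounding
print(motion((1.0, 0.0), (0.0, 0.0), omega_sq=1.0, t=2.0))
```

Starting both masses with equal displacements excites only the in-phase mode; equal and opposite displacements excite only the out-of-phase mode.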
CHAPTER


Functions 

A. Introduction 



Tables of Relations 

MANY TIMES IN EXPERIENCE one encounters correlated numbers. More specifically, given two numbers x and y, say arising from a set of observations, for every value of y there is a value of x. We can tabulate corresponding values of x and y.

Suppose for instance we observe N different values of x and obtain a value of y for each x. A typical example of such a procedure would be to write down the temperature in a room as a function of the height from the floor at a specified position on the floor. Thus for every vertical distance h from the floor there is a corresponding temperature T(h).


[Table: corresponding values of x and y.]



















Graphs 


The table of corresponding values of x and y can be plotted. 
Such a plot is shown below. 



A smooth curve has been drawn through the points. With a discrete set of points the curve which one draws through the points, connecting them, is not unique.

To illustrate this idea two other curves are shown which satisfy equally the correspondence exhibited in the table of Section A.

We then realize that in order to draw a unique curve through a discrete set of points a detailed specification of the behavior of the curve in the interval between points is necessary.
B. The Definition of a Function


If we specify an algebraic relation between the variables x and 
y, the variable y can be defined as a function of the independent 
variable x. 

f(x) is said to be a function of the independent variable x in a 
specified interval x = a to x = b if there are one or more values 
of f(x) for each value of x in the interval. 


C. Definition of an Interval 

The interval is said to be closed if the end points a and b are included.

Closed Interval; a ≤ x ≤ b.

The interval is said to be open if the function f(x) is defined for all points in the interval between a and b but excluding the end points a and b.

Open Interval; a < x < b.

Interval Open at one end; a ≤ x < b or a < x ≤ b.

To illustrate the types of physical situations in which the interval of definition must be clearly specified, consider a length of string L vibrating between two rigid thin plates which support it. For the sake of simplicity assume that the string vibrates as a pure sine wave; the y displacement at any point x being given by

$$y(x, t) = a_0 \sin\left(\frac{\pi x}{L}\right) \sin \omega t$$

Note carefully that this equation must be qualified as holding in the region 0 ≤ x ≤ L.
Because the physical problem states that the string is present only from x = 0 to x = L it is not proper physically to consider the solution outside of the interval.

This example is characteristic of many problems in which the model function exists mathematically outside the region of definition but has no physical meaning. There are interesting mathematical cases in which the function may not be defined at a point or in a region. In such cases care must be exercised in using the function because spurious results may arise from the inadvertent inclusion of a point at which the function is not defined.

As an example suppose we look at the area enclosed between the x axis and the curve given by $1/x^2$ in the interval from a to b (a and b both positive). If by chance we attempt to let "a" go to x = 0 the area under the curve is undefined.

The area in the interval a ≤ x ≤ b is

$$\frac{1}{a} - \frac{1}{b}$$

From the relation above we see that the term 1/a is undefined at a = 0. This example also demonstrates that the interval cannot be taken in such a manner as to include the point x = 0. In other words we cannot discuss the area under the curve when a lies on the left of x = 0 and when b lies on the right even though the relation $\frac{1}{a} - \frac{1}{b}$ exists.
D. Multiplicity:

For a given value of the independent variable x there can be one 
or more values of f(x). If there is but one value of f(x) for every value 
of x the function is said to be single valued. Higher multiplicities 
are labeled similarly. 


Single Valued Functions 


If for every x in the interval a → b there is one and only one value of f(x), f(x) is said to be single valued.

As an example the parabola is shown below.

$y = 2x^2$

or $f(x) = 2x^2$





Double Valued Functions 

If for every a < x < b there are two and only two values of the function, the function is said to be Double Valued in that interval.

It should be noticed in the case above that the function g(y) is double valued in an interval open at one end

g(y) double valued, 0 < y < N

where N can be taken to be any arbitrarily large number.

g(y) is single valued at y = 0

A second example is the circle

$$x^2 + y^2 = R^2$$

If we require that f(x) be a real number then the reader can easily verify that

f(x) is double valued, −R < x < R
f(x) is single valued at x = ±R
A Function which is Triple Valued in the Interval

−b < x < b

is illustrated at the right. This particular function is double valued at

x = ±b

and is single valued in the two open ended intervals

−N < x < −b and b < x < N



Higher Multiplicity 

There are many examples of functions which have a higher multiplicity. One familiar example is the inverse trigonometric function. The sine and cosine functions are single valued, i.e.

f(x) = sin x is single valued.

The inverse function

$g(y) = \sin^{-1}(y), \quad -1 \le y \le +1$

is multivalued. The interval is interesting in that the multiplicity in the open interval

−1 < y < 1

is twice the multiplicity at the ends of the interval,

|y| = 1
























E. The Slope at a Point 

Definition 

Consider the graph of a curve 
f(x). 

The tangent to the curve f(x) at a specified point P is a straight line which we shall label L. This line L makes an angle $\alpha_p$ with respect to the x axis.


A line tangent to a curve f(x) at 
one point touches the curve once 
and only once in the vicinity of the 
point. Because the tangent line L as 
we have drawn it is extended to the 
x axis the curve may oscillate and 
cross this line L at other points. 
Such a situation of course has no 
bearing on our argument. 

For this reason we will deal with 
the angle between L and a line 
parallel to the x axis in the vicinity 
of P. 

In a slope measuring operation, 
performed with a straight edge, it is 
difficult to judge the exact slope. 
There are simple algebraic methods 
for approximating the slope at a 
point. 

Later we shall see that these algebraic methods lead to a tech¬ 
nique by which we determine the exact slope at P. These later 
methods are those of the differential calculus. 


The slope at P = $\tan \alpha_p$.


Approximate Slopes 

f(x) 

The tangent at a point can be obtained approximately by considering a small segment of arc at P. At P the tangent touches the curve f(x) at one point in a region very close to the point in question.

We observe from the diagram at the right that a line L' passing through the points $(x_p + \epsilon)$ and $(x_p - \epsilon)$ at the ends of the arc is approximately parallel to L.

Later we shall find that the derivative is obtained by allowing $\epsilon$ to become arbitrarily small.

If $2\epsilon$ is of the same order as the change in f(x) between $(x_p + \epsilon)$ and $(x_p - \epsilon)$ the approximation may be relatively good.

On the left an extreme example is presented of a case in which the curve fluctuates to a great extent in the interval chosen.

The line L makes an angle $\alpha_p$ with respect to the x axis (or a line parallel to the x axis). The line L' makes an angle $\alpha_p'$ with the x axis (or a line parallel to the x axis).

As the length of arc (or $2\epsilon$) becomes smaller the value of the angle $\alpha_p'$ approaches the value of the angle $\alpha_p$. We assume that as $\epsilon$ becomes arbitrarily small or zero, $\alpha_p'$ becomes equal to $\alpha_p$ except for certain cases which we shall discuss. If the slope is undefined at P then our arguments as presented above do not hold.



The slope of f(x) at P is defined as

slope at P = $\tan \alpha_p$

It is apparent that the slope can be approximated by computing $\alpha_p'$ utilizing two points $x_p + \epsilon$ and $x_p - \epsilon$ close to P.

slope at P ≈ $\tan \alpha_p'$

We can compute the approximate slope from the values of f(x) at the two points $x_p + \epsilon$ and $x_p - \epsilon$.

From the diagram we see that

$$\tan \alpha_p' = \frac{f(x_p + \epsilon) - f(x_p - \epsilon)}{2\epsilon}$$

As an example of this type of approximation consider the parabola which we used as an example earlier.

$$f(x) = 2x^2$$

Let us compute the slope at x = 1 (this is $x_p$) using as the interval $\epsilon = 1/4$. Thus $x_p + \epsilon = 5/4$ and $x_p - \epsilon = 3/4$.

Then

$$\tan \alpha_p' = \frac{f(5/4) - f(3/4)}{2 \times 1/4} = \frac{\frac{25}{8} - \frac{9}{8}}{\frac{1}{2}} = 4 = \text{slope at } x = 1$$

This is a fortuitous result since the true slope at x = 1 is actually 4.
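The two-point slope estimate is easy to try for other intervals and other curves. The sketch below uses hypothetical names (`approximate_slope`, `parabola`); the cubic is an added comparison and does not come from the text. For any quadratic the symmetric two-point estimate reproduces the true slope exactly, which is why the parabola gives 4 regardless of the interval chosen.

```python
def approximate_slope(f, x_p, eps):
    """Slope of the chord through the points at x_p - eps and x_p + eps."""
    return (f(x_p + eps) - f(x_p - eps)) / (2.0 * eps)

def parabola(x):
    return 2.0 * x * x

print(approximate_slope(parabola, 1.0, 0.25))            # 4.0, as in the text
print(approximate_slope(parabola, 1.0, 0.5))             # also 4.0
print(approximate_slope(lambda x: x ** 3, 1.0, 0.25))    # 3.0625; the true slope is 3
```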

It will be of interest in the sections to follow to note here that one can now plot the slope as a function of x. Since for every x we have a $\tan \alpha_x$, we can plot $\tan \alpha_x$ vs x. In the diagrams to follow the simple example of the parabola is used to illustrate the plot of $\tan \alpha_x$ as a function of x.
It is interesting in connection with the discussion of the slope to find that if we plot the displacement of a body as a function of time, the slope of the displacement vs time graph is the magnitude of the velocity in the direction of the displacement (the magnitude of a velocity is called the speed).

$|V_y|_{t_p}$ = slope of y(t) at $t_p$

If we now plot $|V_y|$, the slope of y, as a function of t, the slope of the $|V_y|$ curve at any point is a measure of the magnitude $|a_y|$ of the acceleration in the direction of the displacement.

$|a_y|_{t_p}$ = slope of $|V_y|$ at $t_p$

Thus the magnitude of the acceleration is the second slope of the displacement curve.
Matrix Approach to Slope Approximations

1.) Take a curve given by f(x) in the interval a ≤ x ≤ b. To make the representation simple shift the origin to coincide with a.

2.) Divide the interval a → b into N sub-intervals of equal width $\epsilon$.

3.) The values of f(x) for every interval can be listed in a column, i.e.,

$$f(x) = \begin{pmatrix} f(x_a) = f(0) \\ f(\epsilon) \\ f(2\epsilon) \\ \vdots \\ f(j\epsilon) \\ \vdots \\ f(N\epsilon) = f(b) \end{pmatrix}$$

This tabulation of the values of f(x) can be thought of as a column matrix of N + 1 dimensions.

When this tabulation is used the function can be written using the following subscript notation,

$$\begin{pmatrix} f(0) \\ f(\epsilon) \\ \vdots \\ f(j\epsilon) \\ \vdots \\ f(N\epsilon) \end{pmatrix} = \begin{pmatrix} f_0 \\ f_1 \\ \vdots \\ f_j \\ \vdots \\ f_N \end{pmatrix}$$

4.) The computation of the approximate slope at a point $x = j\epsilon$ (where j is an integer less than N) is a linear combination of the values of f(x) evaluated at $(j - 1)\epsilon$ and $(j + 1)\epsilon$.
If f(x) = the function at x, let

d(x) = the slope at x

Thus

$$d(j\epsilon) = \frac{f([j+1]\epsilon) - f([j-1]\epsilon)}{2\epsilon} = \frac{f_{j+1} - f_{j-1}}{2\epsilon}$$

Because d(x) is a linear transformation of f(x) we can in general write

$$d(m\epsilon) = D_{m0} f_0 + D_{m1} f_1 + D_{m2} f_2 + \dots + D_{mN} f_N = \sum_{n} D_{mn} f_n$$

In order to make the notation consistent with the customary matrix notation, write

$$f(n\epsilon) = f_n, \qquad d(m\epsilon) = d_m$$

Then

$$d_m = \sum_{n} D_{mn} f_n$$

By examining the slope at any point $x = j\epsilon$, i.e. $d(j\epsilon) = d_j$, we find that

$$D_{mn} = 0 \quad \text{for } n < (m - 1), \text{ for } n = m, \text{ and for } (m + 1) < n$$

$$D_{m,m-1} = -\frac{1}{2\epsilon} \qquad (n = m - 1)$$

and

$$D_{m,m+1} = +\frac{1}{2\epsilon} \qquad (n = m + 1)$$
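The slope taking operation now takes the general operator form

$$d = \begin{pmatrix} d_0 \\ \vdots \\ d_j \\ \vdots \\ d_N \end{pmatrix} = D \cdot f = D \cdot \begin{pmatrix} f_0 \\ \vdots \\ f_j \\ \vdots \\ f_N \end{pmatrix}$$

where

$$D = \frac{1}{2\epsilon} \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & 0 & -1 & 0 \end{pmatrix}$$

The example given utilizes a case in which N = 6 or in which D is a 6 × 6 matrix. This has been done merely to illustrate the general matrix form. Any N × N example serves equally well.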

There is one inherent difficulty with the operator D. The approximation of the slope at every point is quite good, except at the ends of the interval.

In other words the values of d(0) and of $d(N\epsilon)$ are very inaccurate.

To maintain the symmetry of this matrix we shall consistently ignore all end point values of the resulting column matrices, i.e., neglect results for x = 0 and for $x = N\epsilon$.
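The operator D and its poor behavior at the end points can be seen in a short computation. This is a sketch only: the names `central_difference_matrix` and `apply_matrix`, the choice of six tabulated points, and the interval width 1/4 are assumptions made for illustration.

```python
def central_difference_matrix(n_points, eps):
    """Square matrix with -1/(2 eps) just below and +1/(2 eps) just above the diagonal."""
    D = [[0.0] * n_points for _ in range(n_points)]
    for m in range(n_points):
        if m > 0:
            D[m][m - 1] = -1.0 / (2.0 * eps)
        if m < n_points - 1:
            D[m][m + 1] = 1.0 / (2.0 * eps)
    return D

def apply_matrix(D, f):
    """Column matrix D . f, computed row by row."""
    return [sum(D_mn * f_n for D_mn, f_n in zip(row, f)) for row in D]

eps = 0.25
f = [2.0 * (j * eps) ** 2 for j in range(6)]    # f(x) = 2 x^2 at x = 0, eps, ..., 5 eps
d = apply_matrix(central_difference_matrix(6, eps), f)
print(d[1:-1])       # [1.0, 2.0, 3.0, 4.0]: interior slopes, equal to 4x
print(d[0], d[-1])   # 0.25 and -4.0: end point values, far from the true 0 and 5
```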

The operator (or matrix) D has many intriguing properties. The slope of the slope curve, which we will label as a vector $d^{(2)}$, is the result of using D twice.

$$d^{(2)} = D \cdot d = D \cdot [D \cdot f] = D \cdot D \cdot f$$

The matrix D · D has a most interesting form. It measures the approximate curvature of f at a point. The reader can develop the matrix D · D by multiplication.
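giving

$$D \cdot D = \frac{1}{4\epsilon^2} \begin{pmatrix} -2 & 0 & 1 & 0 & 0 & 0 \\ 0 & -2 & 0 & 1 & 0 & 0 \\ 1 & 0 & -2 & 0 & 1 & 0 \\ 0 & 1 & 0 & -2 & 0 & 1 \\ 0 & 0 & 1 & 0 & -2 & 0 \\ 0 & 0 & 0 & 1 & 0 & -2 \end{pmatrix}$$

(The two corner elements of the exact product are −1 rather than −2; they belong to the end points, which we ignore.)

By examination of the matrix above we find that at the point $x = n\epsilon$

$$(D \cdot D \cdot f)_n = \frac{f_{n-2} - 2 f_n + f_{n+2}}{4\epsilon^2}$$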


[Figure: a curve that is concave downward, labeled "Negative Curvature".]

If we relate this algebraic equation to the graph shown at the right it is apparent that the operation D · D merely subtracts twice the value of f(x) evaluated at $x = n\epsilon$ from the sum of the values of f(x) taken at points on either side of $n\epsilon$; in other words at $(n - 2)\epsilon$ and at $(n + 2)\epsilon$.

If the curve f(x) is concave downward D · D · f is negative, indicating a negative curvature.

If the curve is concave upward the operation D · D · f at the point in question is positive, or the curvature of f is positive at that point.
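The sign rule for the curvature can be checked on tabulated values. The following is a minimal sketch with hypothetical names (`second_slope`, `upward`, `downward`); it evaluates one interior element of D · D · f directly from three tabulated values two intervals apart rather than forming the matrix.

```python
def second_slope(f, n, eps):
    """Interior element of D.D.f: (f[n-2] - 2 f[n] + f[n+2]) / (4 eps^2)."""
    return (f[n - 2] - 2.0 * f[n] + f[n + 2]) / (4.0 * eps * eps)

eps = 0.25
upward = [2.0 * (j * eps) ** 2 for j in range(7)]   # concave upward parabola 2 x^2
downward = [-value for value in upward]             # the same curve turned over

print(second_slope(upward, 3, eps))     # 4.0  (positive curvature)
print(second_slope(downward, 3, eps))   # -4.0 (negative curvature)
```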
Because of the symmetry about the diagonal element in the matrix D the matrix $D \cdot D = D^2$ appears to have some unnecessary spacing. The matrix



removing every other element in the column matrix f. However for 
a consistent set of operations the symmetry about the diagonal 
should be maintained. 


Finally the matrix D can be shown to have an inverse. In other words, given d(x) can we find f(x)?

$$D^{-1} \cdot d(x) = D^{-1} \cdot D \cdot f = f(x)$$

Later in this chapter it will be demonstrated that the inverse of D is the operation corresponding to taking the area under a curve. Thus it will be of some importance to compute $D^{-1}$.

To perform this inverse operation, as we have indicated, the inaccuracies at the end points begin to play a very important role. One technique by which we can avoid this difficulty is that of taking an extremely large number of sub-intervals and working only with the central part of the matrix, constantly ignoring the end points.

Another method is developed by disregarding the symmetry of D and settling for a slightly more inaccurate estimate of the slope. This inaccuracy can again be minimized by taking a large number of sub-intervals.

We will eliminate every other element in the vector starting with the first. In other words we eliminate the even values of j.


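We then take a large number of intervals M and define the slope at $x = m\epsilon$ as

$$d(m\epsilon) = \frac{f(m\epsilon) - f([m - 1]\epsilon)}{\epsilon}$$

Then

$$D = \frac{1}{\epsilon} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 & 0 \\ 0 & 0 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 1 \end{pmatrix}$$

where

$$D_{mn} = O \quad \text{for } m < n \text{ and for } n < (m - 1)$$

$$D_{mm} = +1/\epsilon$$

and

$$D_{m,m-1} = -1/\epsilon$$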
The operator $D^2$ is much the same. However the computation is shifted slightly from the starting point.

$$D^2 = D \cdot D = \frac{1}{\epsilon^2} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 & 0 & 0 \\ 1 & -2 & 1 & 0 & 0 & 0 \\ 0 & 1 & -2 & 1 & 0 & 0 \\ 0 & 0 & 1 & -2 & 1 & 0 \\ 0 & 0 & 0 & 1 & -2 & 1 \end{pmatrix}$$

$D^2$ is obtained in its more compact form. However the curvature appearing in the j-th position corresponds properly to the j − 1 position. The operations shift 1/2 of an interval for every application. This occurs because this particular form of D is shifted by one-half an interval.

This form however is very useful for the purposes of computing $D^{-1}$.

Remember (writing $D'_{mn}$ for the elements of $D^{-1}$)

$$D'_{mn} = \frac{(-1)^{m+n} \, \text{cof}\, D_{nm}}{|D|}$$

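Using this equation to obtain the elements of $D^{-1}$ we find that

$$D^{-1} = \epsilon \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 & 1 & 1 \end{pmatrix}$$

or, writing $D'_{mn}$ for the elements of $D^{-1}$,

$$D'_{mn} = \epsilon \quad \text{if } n \le m$$

$$D'_{mn} = O \quad \text{if } m < n$$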

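The suspicious reader can immediately check this form by carrying out the multiplication of

$$D^{-1} \cdot D = I$$

$$D^{-1} \cdot D = \frac{\epsilon}{\epsilon} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 & 0 \\ 0 & 0 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$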


The matrix $D^{-1}$ is called the triangular matrix and represents an integration or the operation of taking an area under a curve.
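Because every element on and below the diagonal of the triangular matrix is the same, multiplying by it amounts to a running sum. The sketch below illustrates this with hypothetical names (`backward_difference`, `running_area`) and made-up tabulated values; the first row of D is taken to involve the first tabulated value alone, as in the matrix shown above.

```python
def backward_difference(f, eps):
    """d = D . f with D_mm = 1/eps and D_m,m-1 = -1/eps (first row uses f[0] alone)."""
    return [(f[m] - (f[m - 1] if m > 0 else 0.0)) / eps for m in range(len(f))]

def running_area(d, eps):
    """f = D^-1 . d, where D^-1 has eps on and below the diagonal: a running sum."""
    total, out = 0.0, []
    for value in d:
        total += eps * value
        out.append(total)
    return out

eps = 0.5
f = [1.0, 1.5, 3.0, 5.5, 9.0, 13.5]   # illustrative tabulated values
d = backward_difference(f, eps)
print(d)                       # [2.0, 1.0, 3.0, 5.0, 7.0, 9.0]
print(running_area(d, eps))    # recovers the original column f
```

Each entry of the recovered column is the sum of rectangles of width ε under the slope values up to that point, which is the sense in which the inverse operation takes an area.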

These operations at first sight may appear to be an unnecessary sophistication; however, many problems in practical scientific work must be worked out numerically on computers.

The methods which we are now developing in terms of the matrix symbolism are a concise formulation of the types of operations which can be done on a computer.

In computer calculations the information used is invariably in tabular form. The manipulation of this information to obtain the desired result is in many cases nothing more than the manipulations which we indicate symbolically in terms of our matrices.


F. Continuity 

Problems of continuity in functions reach all degrees and shades of complexity. At this point we can at best merely note a few characteristic examples of continuous and discontinuous functions. We shall make a few definitions which are suited to our purpose of cataloguing the various types of curves which may be encountered.

1. Continuous Functions

We shall say that a function is continuous in the vicinity of a point P if the curve of the slope d(x) in this region has no discontinuities.

The slope curve can possess a "cusp" (a kink, or break) and yet correspond to a continuous function. We shall gain some insight concerning this last statement when we discuss the area under the curve representing the function.

A function f(x) has a "cusp" at a point P if the slope plot is discontinuous at P. That is to say that the slope changes from one value to a different value at the point P, depending upon the side from which the point is approached.

If the slope of the d(x) curve is $d^{(2)}(x)$ at every point x, it is interesting to note that $d^{(2)}(x)$ is singular at P, when P is a point of discontinuity in d(x).
If the magnitude of the function becomes very large on either side of P and arbitrarily close to P, and if the function has the value N at P where N is so large that it is undefined, we say that f(x) is singular at P.

It is worth while to mention several singular functions in order to illustrate this case.

The slope of a step function is singular at the step.

Let

f(x) = 0,   −N < x < P

and

f(x) = +1,   P ≤ x < N

Note that we use an open interval to the left and a closed interval to the right in order to define the value of f(x) at the discontinuity P. Then the slope curve is illustrated at the right.

The function is said to be undefined at a singularity.

We can see how the singular slope of the step function is obtained in the limit by examining a function f'(x) which is not quite a step function.

The d'(x) curve has a width W at 1/2 the maximum value. It should be intuitively (?) obvious that as f'(x) approaches a step function, the width W approaches zero, and the maximum value of d'(x) becomes infinite.

Singularities can be of many types; however, some characteristic examples are

1. The function which is positive on both sides of the singularity.

2. The function which changes sign upon passing through the singularity.

A familiar example of this is the $\tan \theta$ curve at $\theta = \pi/2$.

One final means of characterizing a singularity is to measure the area under the curve in an interval containing the singularity.

There are but two types of curves relative to this criterion; those having a finite area under the singularity and those having an infinite area under the singularity.


G. Areas 


Regard the curve representing 
the function f(x) defined in the 
interval a < x < b. The area un¬ 
der the curve between x = a and 
x = P is shown as the shaded 
region. 

This area for practical purposes 
at this point can be obtained by 
various approximate techniques. 



Approximation Methods 


First plot the curve on paper of constant density and thickness and then cut out a standard rectangular area and also cut out the area in question. By comparing the ratio of the weights of the two cutouts we can obtain the area under the function from x = a to x = P. If the sides of the standard rectangle are $l_1$ and $l_2$, then the area is $l_1 l_2$.

Therefore,

$$\frac{\text{Weight of area under } f(x) \text{ from } x = a \text{ to } x = P}{\text{Weight of rectangular area } (l_1 \cdot l_2)} = \frac{\text{area under } f(x)}{(l_1 \cdot l_2)}$$

If f(x) is plotted upon cross hatched paper we can count squares in the region in question.

Rectangular Partitioning

This approximation method for finding the area under a curve is extremely useful. Thus it is worth our while to examine this method in detail.

Our problem is to obtain an approximation to the area under the curve f(x) between the point a and a point x = P.

To do this let us divide the interval a → P into n equal segments of width $\epsilon$; thus

$$\epsilon = \frac{P - a}{n}$$

We set $\epsilon$ small enough that the slope of f(x) does not change by a great amount in the interval $\epsilon$.

[Figure: two panels. In the first the slope is relatively constant in the interval $\epsilon$. In the second $\epsilon$ is too large; the slope changes decidedly in the interval.]
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The method of rectangular partitioning is performed by passing 
a horizontal line through f(x) at the mid point of each interval c. 
The horizontal lines when joined to the vertical lines at the extrem¬ 
ities of the interval £ form a rectangle. 

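This procedure is illustrated below.

[Figure: f(x) with midpoint rectangles of width ε]

Regard the j-th interval.

The area of the j-th interval = ε · f(a + [j − 1/2] ε).

The total area between a and P can be obtained by summing the
n rectangles.

\[
\text{Total Area}_{a \to P} = A_{a \to P} \approx \sum_{j=1}^{n} \varepsilon \cdot f\!\left(a + \left[j - \tfrac{1}{2}\right]\varepsilon\right)
\]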


To illustrate this let us take a case
in which a = 0 and P = 10. Assume
10 intervals.

n = 10 and ε = 1

Then we make a table.

  j     j − 1/2     f(a + [j − 1/2] ε)
  1      1/2         f(1/2)
  2      3/2         f(3/2)
  3      5/2         f(5/2)
  4      7/2         f(7/2)
  5      9/2         f(9/2)
  6     11/2         f(11/2)
  7     13/2         f(13/2)
  8     15/2         f(15/2)
  9     17/2         f(17/2)
 10     19/2         f(19/2)

If we now add all of the f(a + [j − 1/2] ε)'s and multiply by ε
we have our approximation.

\[ A_{0 \to 10} \approx \varepsilon \cdot \{ f(1/2) + f(3/2) + \cdots + f(19/2) \} \]


A simple numerical example will further clarify our method. In
the case of the parabola $f(x) = 2x^2$ find the area from x = 0 to x = 1
using two intervals, i.e.

n = 2; ε = 1/2

Our table becomes

  j     j − 1/2     f(a + [j − 1/2] ε)
  1      1/2         f(1/4) = 1/8
  2      3/2         f(3/4) = 9/8

and

\[ A_{0 \to 1} \approx \tfrac{1}{2} \left\{ \tfrac{1}{8} + \tfrac{9}{8} \right\} = \tfrac{5}{8} \]

The true value of this area by
integration is

\[ A_{0 \to 1} = \tfrac{2}{3} \]

The fractional error in our calculation then is

\[ \frac{2/3 - 5/8}{2/3} = \frac{1}{16} \]
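
The midpoint-rectangle sum above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `midpoint_area` and the use of exact fractions are assumptions made here, not part of the text.

```python
from fractions import Fraction


def midpoint_area(f, a, p, n):
    """Approximate the area under f from a to p with n midpoint rectangles."""
    eps = Fraction(p - a) / n  # width of each sub-interval
    # Sum eps * f(a + [j - 1/2] eps) for j = 1 .. n.
    return eps * sum(f(a + (j - Fraction(1, 2)) * eps) for j in range(1, n + 1))


def parabola(x):
    """The example function f(x) = 2x^2."""
    return 2 * x * x


approx = midpoint_area(parabola, 0, 1, 2)
exact = Fraction(2, 3)
fractional_error = (exact - approx) / exact
print(approx, fractional_error)  # 5/8 1/16
```

Raising n (for example to 10 or 100) shows how quickly the fractional error shrinks for this smooth function.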


Trapezoidal Partitioning 

It should be a reasonable assumption that once the interval
a → P is split up into n sub-intervals of width ε, various connecting
curves can be used to form the top of the area defined by ε.

The same considerations concerning changes in slope over the
interval ε still apply.

Another useful partition method is that of forming trapezoids. The
points at which the extremities of the interval ε cross f(x) can be
connected by straight line segments to form areas of the shape shown
below. If the curvature has predominantly one sign in the interval
this method either overestimates or underestimates the area. Whether
the evaluation is high or low depends on the predominant sign of the
curvature.

[Figure: a single trapezoidal element of width ε]

This shape is a trapezoid of area

\[ \frac{\varepsilon}{2} \left\{ f(a + j\varepsilon) + f(a + [j + 1]\,\varepsilon) \right\} \]

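A typical trapezoidal partitioning is shown below.

[Figure: trapezoidal partitioning of the area under f(x)]

In this case

\[
A_{a \to P} \approx \varepsilon \cdot \left\{ \frac{f(a) + f(a + \varepsilon)}{2} + \frac{f(a + \varepsilon) + f(a + 2\varepsilon)}{2}
+ \cdots + \frac{f(a + (n - 1)\varepsilon) + f(a + n\varepsilon)}{2} \right\}
\]

\[
= \varepsilon \cdot \left\{ \sum_{j=0}^{n} f(a + j\varepsilon) - \tfrac{1}{2} f(a) - \tfrac{1}{2} f(a + n\varepsilon) \right\}
\]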


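To illustrate this method let us approximate the area under the
parabola used before in the interval 0 to 1.

\[ f(x) = 2x^2 \]

Then if n = 2, ε = 1/2.

  x      j     f(a + jε)
  0      0     f(0) = 0
  1/2    1     f(1/2) = 1/2
  1      2     f(1) = 2

\[
A_{0 \to 1} \approx \tfrac{1}{2} \left\{ \frac{f(0) + f(1/2)}{2} + \frac{f(1/2) + f(1)}{2} \right\}
= \tfrac{1}{2} \left\{ \tfrac{1}{4} + \tfrac{5}{4} \right\} = \tfrac{3}{4}
\]

\[ \text{The fractional error} = \frac{2/3 - 3/4}{2/3} = -\frac{1}{8} \]

We notice that the method of rectangular partitioning is better
than that of trapezoidal partitioning for this particular function:
the magnitude of the fractional error is 1/16 against 1/8. The
larger error of the latter method is caused by the overall positive
curvature of the function, which makes every chord lie above the curve.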
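
As a hedged illustration, the trapezoidal sum can be coded directly from the closed form (the full sum of f(a + jε) less half of each end value). The name `trapezoid_area` is chosen here for the sketch and does not come from the text.

```python
from fractions import Fraction


def trapezoid_area(f, a, p, n):
    """Approximate the area under f from a to p with n trapezoids."""
    eps = Fraction(p - a) / n  # width of each sub-interval
    total = sum(f(a + j * eps) for j in range(0, n + 1))
    # The two end ordinates are counted only half as often as the interior ones.
    total -= Fraction(1, 2) * (f(a) + f(a + n * eps))
    return eps * total


def parabola(x):
    """The example function f(x) = 2x^2."""
    return 2 * x * x


trap = trapezoid_area(parabola, 0, 1, 2)
exact = Fraction(2, 3)
trap_error = (exact - trap) / exact
print(trap, trap_error)  # 3/4 -1/8
```

The negative sign of the error records that the trapezoids overestimate the area of this upward-curving parabola.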

The choice of method is a matter of judgment depending on the
special problem to be solved.

The Area under the Slope Curve

In this section we will prove a very important theorem.

The area under the slope curve of a function f(x), obtained by chord
approximation in the interval from x = a to a point x, is equal to the
value of the function at x, f(x), minus a constant. The constant is f(a).

The approach is quite straightforward.

1. We first will obtain the approximate slope curve d(x) or $\tan \alpha_x$ by
the chord approximation in the interval ε.

2. Second we will find the area under the approximate slope curve
from a to x by rectangular partitioning.

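As before let us first construct our function f(x) in the interval
a < x < b, splitting the interval into n sub-intervals of width ε.

Fortunately we have arranged our intervals in a convenient manner
since

\[ [\text{Area Slope Curve}]_{a \to x} \approx \sum_{j=1}^{n} \varepsilon \cdot d\!\left(a + (j - \tfrac{1}{2})\,\varepsilon\right) \]

but $d(a + (j - \tfrac{1}{2})\varepsilon)$ from our approximation is
approximately

\[ \frac{f(a + j\varepsilon) - f(a + (j - 1)\varepsilon)}{\varepsilon} \]

Then

\[
[\text{Area Slope Curve}]_{a \to x} \approx \sum_{j=1}^{n} \varepsilon \cdot \frac{f(a + j\varepsilon) - f(a + (j - 1)\varepsilon)}{\varepsilon}
\]

\[ \approx \sum_{j=1}^{n} \{ f(a + j\varepsilon) - f(a + (j - 1)\varepsilon) \} \]

\[
\approx \{ f(a + \varepsilon) - f(a) + f(a + 2\varepsilon) - f(a + \varepsilon) + \cdots
+ f(a + (n - 1)\varepsilon) - f(a + (n - 2)\varepsilon)
+ f(a + n\varepsilon) - f(a + (n - 1)\varepsilon) \}
\]

\[ \approx f(a + n\varepsilon) - f(a) \]

Since x = (a + nε)

\[ [\text{Area Slope Curve}] \approx f(x) - f(a) = f(x) - \text{a constant} \]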
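
The cancellation in this sum can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names `chord_slope` and `area_under_slope_curve` are invented here, and the test function f(x) = x²/10 with ε = 2 is simply a convenient choice.

```python
from fractions import Fraction


def chord_slope(f, left, eps):
    """Chord approximation to the slope over the interval [left, left + eps]."""
    return (f(left + eps) - f(left)) / eps


def area_under_slope_curve(f, a, n, eps):
    """Rectangular partitioning applied to the chord-slope values on n intervals."""
    return sum(eps * chord_slope(f, a + (j - 1) * eps, eps) for j in range(1, n + 1))


def f(x):
    """Assumed test function f(x) = x^2 / 10."""
    return Fraction(x) ** 2 / 10


area = area_under_slope_curve(f, 0, 3, 2)  # three intervals of width 2: x from 0 to 6
print(area, f(6) - f(0))  # 18/5 18/5
```

Because each interior value f(a + jε) enters once with a plus sign and once with a minus sign, the agreement is exact for any function, not only for a parabola.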

We are left with the final extension in the integral calculus to show
that this is an exact equality.

It is instructive to illustrate this with a simple numerical example.
If we make use of the parabola

\[ f(x) = \frac{x^2}{10} \]

in the interval 0 < x < 10, and take five intervals from 0 to 10
(so that ε = 2), then, knowing the values of $\tan \alpha_x$ or d(x) as a
function of x, we can compute the area under the slope curve from
x = 0 to x = 6.

\[
[\text{Area}]_{0 \to 6} \approx \varepsilon \sum_{j=1}^{3} d\!\left(0 + (j - \tfrac{1}{2})\,\varepsilon\right)
= 2 \{ d(1) + d(3) + d(5) \}
\]

\[
= 2 \left\{ \tfrac{2}{10} + \tfrac{6}{10} + \tfrac{10}{10} \right\} = 2 \cdot \tfrac{18}{10} = \tfrac{36}{10} = \tfrac{18}{5}
\]

\[ \text{Now } -f(a = 0) + f(x = 6) = -0 + \tfrac{36}{10} = \tfrac{18}{5} \]

This result demonstrates our theorem.

Area Computations in Terms of Matrices


Earlier the slope of a curve f(x) was represented as a linear
transformation. Previous to this section the area computation was
shown to be the inverse of the slope calculation. This inverse
relationship becomes even more apparent when the area calculations
are represented in terms of a linear transformation.

To a reasonable approximation the area between the x axis and
f(x) in the interval 0 to nε can be represented as

\[ A(n\varepsilon) = \text{area from 0 to } n\varepsilon \approx \sum_{j=1}^{n} \varepsilon \cdot f(j\varepsilon) \]

Write f(x) as a column vector having its j-th element given as

\[ f_j = f(j\varepsilon) \]

with

\[
\mathbf{f} = \begin{pmatrix} f(\varepsilon) \\ f(2\varepsilon) \\ \vdots \\ f(N\varepsilon) \end{pmatrix}
= \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_N \end{pmatrix}
\]

Also let A(x) (the area between 0 and x) be a vector whose
elements are

\[ A_n = A(n\varepsilon) \]

where

\[
\mathbf{A} = \begin{pmatrix} A(\varepsilon) \\ A(2\varepsilon) \\ \vdots \\ A(N\varepsilon) \end{pmatrix}
= \begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_N \end{pmatrix}
\]

Then

\[ A_n = \sum_{j=1}^{N} A_{nj} f_j \]

and

\[ A_{nj} = \varepsilon \quad (j \le n), \qquad A_{nj} = 0 \quad (j > n) \]


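With this approximation the area matrix A assumes a familiar
form,

\[
A = \varepsilon \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & \cdots \\
1 & 1 & 0 & 0 & 0 & \cdots \\
1 & 1 & 1 & 0 & 0 & \cdots \\
1 & 1 & 1 & 1 & 0 & \cdots \\
1 & 1 & 1 & 1 & 1 & \cdots \\
\vdots & & & & & \ddots
\end{pmatrix}
\]

A is a triangular matrix and as such is the inverse of the slope
matrix D;

\[ A = D^{-1} \]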


Note carefully that we are working with an approximation 
method. Thus the results obtained here can only be considered as 
approximate relationships. Later we will observe that the operation 
of integration (area computation) is the inverse of differentiation 
(slope computation) in one definition of the integral. 

It is interesting to observe that both operations involve
half-interval shifts; however, the shift of A is compensated for in
the reverse shift of D.
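
The inverse relationship can be checked with a small numerical sketch. Two assumptions are made here and should be read as such: the slope matrix D is taken in a backward-difference form (1/ε on the diagonal, −1/ε just below it), since its definition lies outside this section, and the vector g = (1, 2, ..., 8) is only a test vector. All function names are illustrative.

```python
from fractions import Fraction


def area_matrix(n, eps):
    """Lower-triangular area matrix: eps on and below the diagonal, zero above."""
    return [[eps if j <= i else Fraction(0) for j in range(n)] for i in range(n)]


def slope_matrix(n, eps):
    """ASSUMED backward-difference slope matrix: 1/eps on the diagonal, -1/eps below it."""
    rows = []
    for i in range(n):
        row = [Fraction(0)] * n
        row[i] = 1 / eps
        if i > 0:
            row[i - 1] = -1 / eps
        rows.append(row)
    return rows


def mat_mul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_vec(m, v):
    """Product of a matrix with a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


N = 8
eps = Fraction(1)
A = area_matrix(N, eps)
D = slope_matrix(N, eps)
identity = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
g = [n * eps for n in range(1, N + 1)]  # test vector g_n = n * eps
f = mat_vec(A, g)                       # running sums of g

print(mat_mul(A, D) == identity)  # True
print([int(v) for v in f])        # [1, 3, 6, 10, 15, 21, 28, 36]
```

Multiplying by A simply forms running sums, which is why it undoes the differences formed by D.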

Using the techniques available at this point we can solve simple 
problems. As an example solve the following problem. 

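The slope of the function f(x) in the interval x = 0 to x = 8 is
given by g(x) = x. Find f(x).

Symbolically this problem is written

\[ D \cdot \mathbf{f} = \mathbf{g} \]

or

\[
D \cdot \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_N \end{pmatrix}
= \begin{pmatrix} g_1 \\ g_2 \\ \vdots \\ g_N \end{pmatrix}
\]

Take N as 8; then ε = 1 and g(nε) = nε.

\[
\mathbf{g} = \begin{pmatrix} \varepsilon \\ 2\varepsilon \\ 3\varepsilon \\ \vdots \\ 8\varepsilon \end{pmatrix}
= \begin{pmatrix} 1 \\ 2 \\ 3 \\ \vdots \\ 8 \end{pmatrix}
\]

The problem is solved quite readily for f by clearing with A
or $D^{-1}$:

\[ A \cdot D \cdot \mathbf{f} = I \cdot \mathbf{f} = \mathbf{f} = A \cdot \mathbf{g} \]

\[
\begin{pmatrix} f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_8 \end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
\vdots & & & \ddots & & & & \vdots \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}
\begin{pmatrix} 1 \\ 2 \\ 3 \\ \vdots \\ 8 \end{pmatrix}
\]

After carrying out the matrix multiplication

\[ \mathbf{f} = \begin{pmatrix} 1 \\ 3 \\ 6 \\ 10 \\ \vdots \\ 36 \end{pmatrix} \]

Again keep in mind that in working out problems using these
techniques constant care must be exercised with respect to values
obtained at the end points.

CHAPTER 5


Differential and
Integral Calculus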


A. Introduction 


IN THE PREVIOUS CHAPTER the computation of the approximate area
under the curve representing the slope at every point of
a function f(x) was shown to be related approximately to the function
from which the slope was taken. This class of computations is
the major interest of the differential and integral calculus.

The derivative of a function is a measure of the slope of that function
at a point in the limit as the interval considered becomes
arbitrarily small.

There are a number of ways of defining the integral. In the discussion
to follow the integral will be considered as the anti-derivative or
the inverse operation to that of differentiation.

The integral of a function f(x) is a measure of the AREA between
the function and the x axis. Areas above the axis are positive and
areas below the axis are negative.

The diagram shown here will
serve to introduce the approach.

[Figure: a trapezoidal element of f(x) between $x_{i-1}$ and $x_i$]

The basic relationship between the derivative and the integral
will be demonstrated with this simple diagram of a trapezoidal
element.


Instruction! Plot the area under
the curve f(x) from 0 to x as a
function of x.

Let $A_{i-1}$ be the area from 0 to $x_{i-1}$.

Let $A_i$ be the area from 0 to $x_i$.

Definition!

[Figure: the area curve A(x) plotted against x]

The symbol $\Delta A_i$ will represent the difference between the area $A_i$
and the area $A_{i-1}$.

Since $A_j$ is denoted by the index at the right of the interval, the
total area to a point $x_j$ is given by

\[ A_j = \sum_{m=1}^{j} \Delta A_m = \Delta A_1 + \Delta A_2 + \cdots + \Delta A_j \]

\[ = (A_1 - A_0) + (A_2 - A_1) + \cdots + (A_j - A_{j-1}) \]

where $A_0 = 0$.

As a special case take the function

\[ f(x) = 1 \]

The area of the rectangle from x = 0 to x is

\[ A(x) = 1 \cdot x \]

We can therefore compute and plot A(x) as a function of x.

This is a particularly simple example in which the slope of A(x)
is constant and equal to one. It is apparent here that the slope of
A(x) equals f(x).

The symbol Δ is used consistently to signify differences such as

\[ x_i - x_{i-1} = \Delta x_i \]

The term $\Delta A_i = A_i - A_{i-1}$ from the area curve is given
approximately from the trapezoidal element of the f(x) diagram.

Define!

\[ \bar{f}(x_i) = \tfrac{1}{2} \{ f(x_i) + f(x_{i-1}) \} = \text{the mean value in the interval } \Delta x_i \]

If the interval $\Delta x_i = x_i - x_{i-1}$ is sufficiently
small the arc of f(x) in this interval can be considered
to be approximately a straight line.

Then

\[
\Delta A_i = A_i - A_{i-1} = \tfrac{1}{2} \{ f(x_i)\,\Delta x_i + f(x_{i-1})\,\Delta x_i \}
= \tfrac{1}{2} \{ f(x_i) + f(x_{i-1}) \}\,\Delta x_i
= \bar{f}(x_i)\,\Delta x_i
\]








In other words we have calculated the mean rectangular area.
By dividing both sides of this equation by $\Delta x_i$ we can relate the
average value of the function f(x) in the interval $\Delta x_i$ to the slope of
the area curve.

\[ \frac{\Delta A_i}{\Delta x_i} = \bar{f}_i \qquad \text{where } \bar{f}_i = \bar{f}(x_i) \]

The net result of this is that the area (as we would expect) is
approximately the sum of the areas of the trapezoids $\bar{f}(x_i)\,\Delta x_i$.

\[ A_j = \sum_{i=1}^{j} \Delta A_i = \sum_{i=1}^{j} \bar{f}(x_i)\,\Delta x_i \]

The integral of f(x) from $x_0$ to $x_j$ is written as

\[ A_j = \int_{0}^{x_j} f(x)\,dx = \lim_{\Delta x_i \to 0} \sum_{i=1}^{j} \bar{f}(x_i)\,\Delta x_i \]

where the number of intervals between 0 and $x_j$ becomes appropriately
large as the width of each $\Delta x_i$ approaches zero.

In other words

\[ \lim_{\Delta x_i \to 0} \sum_{i=1}^{j} \Delta x_i = x_j - x_0 = \text{length along x between zero and } x_j \]

The number of intervals must be such that the sum of the $\Delta x_i$'s gives
the length $x_j - x_0$.
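
A brief numerical sketch may help fix the idea that the slope of the area curve reproduces the mean value of f. The helper names `area_curve` and `area_slopes` and the grid of quarter-unit steps are assumptions chosen for illustration only.

```python
def area_curve(f, xs):
    """Running trapezoidal areas A_0 = 0, A_1, ..., one value for each point in xs."""
    areas = [0.0]
    for left, right in zip(xs, xs[1:]):
        mean_value = 0.5 * (f(left) + f(right))  # mean of f over the interval
        areas.append(areas[-1] + mean_value * (right - left))
    return areas


def area_slopes(areas, xs):
    """Ratios (A_i - A_{i-1}) / (x_i - x_{i-1}) for each interval."""
    return [(areas[i] - areas[i - 1]) / (xs[i] - xs[i - 1]) for i in range(1, len(xs))]


xs = [0.25 * i for i in range(9)]  # 0, 0.25, ..., 2.0

# Special case f(x) = 1: the area curve is A(x) = x and its slope is 1.
constant_areas = area_curve(lambda x: 1.0, xs)
constant_slopes = area_slopes(constant_areas, xs)

# f(x) = x: each slope equals the mean value of f on its interval.
linear_areas = area_curve(lambda x: x, xs)
linear_slopes = area_slopes(linear_areas, xs)

print(constant_slopes)
print(linear_areas[-1], linear_slopes)
```

For f(x) = x the final area comes out as 2, the area of the triangle under the line from 0 to 2.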


We have introduced the term

LIMIT.

There are subtle problems involved in the consideration of the
process of taking a limit; therefore we will leave the more extensive
discussion of this subject to a separate section.

We defined the integral as the limit of a sum of $\Delta A_i$'s.

The derivative of A(x) will be defined as the limit of the ratio of
$\Delta A_i$ to $\Delta x_i$ as $\Delta x_i$ approaches zero.

The derivative with respect to x is denoted by the symbol

\[ \frac{d}{dx} \]

\[ \text{Derivative of } A(x) = \frac{dA}{dx} = \lim_{\Delta x_i \to 0} \frac{\Delta A_i}{\Delta x_i} \]

Finally we note the anti-derivative characteristic of the integral.

If

\[ A(x) = \int f(x)\,dx \]

then

\[ \lim_{\Delta x_i \to 0} \bar{f}(x_i) = f(x_i) = \left. \frac{dA}{dx} \right|_{x_i} \]


The reader can rightly claim that more has been implied here
than has been discussed.

What is the point?

As an introduction we have demonstrated that

\[ \bar{f}(x_i) \approx \frac{\Delta A_i}{\Delta x_i} \]

or that the slope of the area curve A(x) is approximately equal to
the average value of the function f(x) in the interval of interest.
Furthermore we hope to demonstrate the equality

\[ f(x) = \frac{dA(x)}{dx} \]

by noting that the area given by the sum

\[ A_j = \lim_{\Delta x_i \to 0} \sum_{i=1}^{j} \Delta A_i = \int dA \]

\[ = \lim_{\Delta x_i \to 0} \sum_{i=1}^{j} \frac{\Delta A_i}{\Delta x_i}\,\Delta x_i = \int \frac{dA}{dx}\,dx \]

\[ = \lim_{\Delta x_i \to 0} \sum_{i=1}^{j} \bar{f}(x_i)\,\Delta x_i = \int f(x)\,dx \]

Thus

\[ f(x) = \frac{dA(x)}{dx} \]

With this general survey of what is to follow we can turn to a
detailed consideration of the limit followed by a development of
the derivative and the integral (or anti-derivative).

B. The Concept of a Limit

Limits of Various Example Functions


The notion of a “limiting value” introduced in the preceding
section is of critical importance in mathematical analysis.
We shall examine the notion briefly without specific reference
to areas or tangent lines.

1. The Sine and Cosine.

In the diagram to the right we represent the sine and cosine
functions geometrically.

If θ → 0, then the vertical segment representing sin θ also
approaches zero.

\[ \lim_{\theta \to 0} \sin\theta = 0 \]

and conversely. In addition

\[ \cos\theta = \sqrt{1 - \sin^2\theta} \]

hence as θ → 0

\[ \cos\theta \to 1 \]

These statements although elementary are useful as illustrations
since they are geometrically suggested.

2. Consider the Rational Function.

\[ \frac{x^2 + 3x + 1}{2x^2 - x} \]

As x → ∞ (or as 1/x → 0) the expression above tends to the value
1/2. This value is evident if we rewrite the rational function above
as

\[ \frac{1 + 3/x + 1/x^2}{2 - 1/x} \]

The terms 3/x, $1/x^2$, and 1/x all approach zero as x is made
arbitrarily large.

3. The limit of the sequence.

\[ S_n = 1 + x + x^2 + \cdots + x^n \]

approaches the value

\[ \frac{1}{1 - x} \]

(in the open interval |x| < 1) as n is made arbitrarily large.
This is usually written “as n → ∞”.

This result is demonstrated by multiplying $S_n$ by x and subtracting $S_n$:

\[ x S_n - S_n = (x + x^2 + \cdots + x^{n+1}) - (1 + x + \cdots + x^n) \]

\[ S_n (x - 1) = S_{n+1} - 1 - S_n = -(1 - x^{n+1}) \]

Therefore

\[ S_n = \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x} - \frac{x^{n+1}}{1 - x} \]

If |x| < 1 then $|x|^{n+1} \to 0$ as n → ∞

\[ \lim_{n \to \infty} S_n = \lim_{n \to \infty} \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x} \]

4. The function $\frac{\sin x}{x} \to 1$ as x → 0. This statement can be
demonstrated geometrically or analytically.

If we represent sin x as a series expansion

\[ \sin x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n + 1)!} = x - \frac{x^3}{3!} + \cdots \]

then

\[ \frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \cdots \]

and

\[ \lim_{x \to 0} \frac{\sin x}{x} = 1 \]
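
The three limits in examples 2 to 4 can be watched numerically. The following is a rough sketch only; the sample values of x, n and the function names are arbitrary choices made here.

```python
import math


def rational(x):
    """The rational function (x^2 + 3x + 1) / (2x^2 - x)."""
    return (x * x + 3 * x + 1) / (2 * x * x - x)


def partial_sum(x, n):
    """S_n = 1 + x + x^2 + ... + x^n."""
    return sum(x ** k for k in range(n + 1))


def sinc(x):
    """sin(x) / x for x != 0."""
    return math.sin(x) / x


# Example 2: the value approaches 1/2 as x grows.
for x in (10.0, 1e3, 1e6):
    print("rational", x, rational(x))

# Example 3: with x = 0.5 the partial sums approach 1 / (1 - 0.5) = 2.
for n in (5, 10, 20, 40):
    print("S_n", n, partial_sum(0.5, n))

# Example 4: sin(x)/x approaches 1 as x shrinks.
for x in (1.0, 0.1, 0.01, 0.001):
    print("sinc", x, sinc(x))
```

Trying a value such as x = 1.5 in `partial_sum` shows the sums growing without bound, which is why the restriction |x| < 1 matters.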


The Formal Definition of a Limit 

With these examples in view we proceed
to a more formal definition of a Limit.

Let f(x) be a function (possibly many valued) defined for all x
in some interval $x_0 < x < b$. Note that the interval can be
open; in other words we allow the possibility that f(x) may not
be defined at $x_0$ (or b).

We write

\[ \lim_{x \to x_0(+)} f(x) = L \]

(in this expression $x_0(+)$ is a value of x to the right of $x_0$ and
arbitrarily close to $x_0$) if

\[ L - f(x) \to 0 \quad \text{as } x \to x_0(+) \]

i.e., if L − f(x) can be made arbitrarily small in absolute value by
confining x to a small enough interval on the right of $x_0$, say
$x_0 < x < (x_0 + \Delta x)$, then we call L

The limit from the right of f(x)
as x → $x_0$ from the right.

In a similar manner, if f(x) is also defined to the left of $x_0$,
say in the interval $a < x < x_0$, we can define the left-hand limit:

\[ \lim_{x \to x_0(-)} f(x) = L' \]

This equation means that L' − f(x) can be made arbitrarily small in
magnitude by confining x to vary in a sufficiently small interval

\[ x_0 - \Delta x < x < x_0 \]

to the left of $x_0$.

If L = L' we may write simply

\[ \lim_{x \to x_0} f(x) = L \text{ or } L' \]

As an example of a function which has neither right- nor
left-hand limits we examine a function such as sin (1/x)

as x → 0



This particular function oscillates more and more rapidly as x
approaches zero from either side, and the function is undefined at
x = 0.

In our definition of $\lim_{x \to x_0} f(x)$ we have assumed that $x_0$ is
finite. We define

\[ \lim_{x \to +\infty} f(x) \quad \text{and} \quad \lim_{x \to -\infty} f(x) \]

to mean lim f(x) as 1/x → 0(+) and as 1/x → 0(−), respectively.

In our previous example of the sequence

\[ S_n = 1 + x + x^2 + \cdots + x^n \]

the quantity $S_n$ is viewed as a function of n where n takes on only
positive integral values. For a function f(n) in which the variable can
take on only integral values we make the definition of the limit
analogous to those above, viz.,

\[ \lim_{n \to \infty} f(n) = \lim_{(1/n) \to 0(+)} f(n) \]

\[ \lim_{n \to \infty} f(n) = L \]

if L − f(n) can be made arbitrarily small by confining 1/n to an
arbitrarily small interval

\[ 0 < \frac{1}{n} < \delta \quad \text{(where } \delta \text{ can be made arbitrarily small),} \]

or what comes to the same thing by confining n to suitably large
values

\[ n_0 < n < \infty \]

We are able to construct a similar definition for

\[ \lim_{n \to -\infty} f(n) \]

Continuity 

We can now give a more precise definition of continuity. Let f(x)
be defined and single-valued at every point of an interval a ≤ x ≤ b
(or a < x < b, etc.). Then f(x) is continuous in the interval if at each x
in the interval we have

\[ \lim_{x' \to x} f(x') = f(x) \]

(At the end-points we use the one-sided limits: $\lim_{x' \to a(+)} f(x') = f(a)$, etc.)

It is easily seen that the examples above conform to the definitions
we have just made.

Simple formal properties.

\[ \lim (f \cdot g) = (\lim f)(\lim g) \]
\[ \lim (f \pm g) = (\lim f) \pm (\lim g) \]
\[ \lim (f/g) = \frac{\lim f}{\lim g} \]

Why all of this worry?

Some functions are discontinuous.

As an example of a discontinuous function
consider

\[ f(x) = \frac{x(x + 1)}{|x|} \]

For x > 0 we have |x| = x, hence f(x) = x + 1,
and so f(x) → 1 as x → 0(+). For x < 0 we
have f(x) = −(x + 1), and so f(x) → −1 as
x → 0(−). The function is not defined for
x = 0. The graph is shown at the left.

Fundamental Properties of Limits

Suppose f(x) and g(x) are defined for $x_0 < x < b$, and suppose
that

f(x) → L and g(x) → L' as x → $x_0(+)$

or, in the earlier notation,

\[ \lim_{x \to x_0(+)} f(x) = L \quad \text{and} \quad \lim_{x \to x_0(+)} g(x) = L' \]

We claim that

f(x) ± g(x) → L ± L' as x → $x_0(+)$
f(x) · g(x) → L · L' as x → $x_0(+)$
f(x)/g(x) → L/L' as x → $x_0(+)$, (if L' ≠ 0)

The verification of the first of these we leave as an exercise. For
the second and third put

\[ \varepsilon(x) = f(x) - L, \qquad \eta(x) = g(x) - L' \]

so that, by definition, |ε(x)| and |η(x)| can be made arbitrarily
small by confining x to a sufficiently small interval to the right of
$x_0$, say $x_0 < x < (x_0 + \Delta x)$. We must show that |f(x) g(x) − LL'|
and |f(x)/g(x) − L/L'| can be made arbitrarily small by confining
x to a sufficiently small interval to the right of $x_0$. Now

\[ f(x)\,g(x) = (\varepsilon(x) + L) \cdot (\eta(x) + L') = \varepsilon\eta + \varepsilon L' + \eta L + LL' \]

and so

\[ |f(x)\,g(x) - LL'| = |\varepsilon\eta + \varepsilon L' + \eta L| \le |\varepsilon||\eta| + |\varepsilon||L'| + |\eta||L| \]

Clearly the right-hand side, hence also the left, can be made as
small as desired by keeping $x_0 < x < (x_0 + \Delta x)$ with small enough
Δx, since that is true of ε and η.

Similarly,

\[
\frac{f}{g} - \frac{L}{L'} = \frac{\varepsilon + L}{\eta + L'} - \frac{L}{L'}
= \frac{\varepsilon L' + LL' - L\eta - LL'}{\eta L' + (L')^2}
= \frac{\varepsilon L' - \eta L}{\eta L' + (L')^2}
\]

Therefore

\[ |f/g - L/L'| = \frac{|\varepsilon L' - \eta L|}{|\eta L' + (L')^2|} \]

Clearly the numerator on the right can be made as small as desired
by keeping $x_0 < x < (x_0 + \Delta x)$, with sufficiently small Δx.
Assuming L' ≠ 0, then $(L')^2 \ne 0$ and the denominator on the right
cannot get arbitrarily small when x is near enough to $x_0$. Q. E. D.

These properties can equally well be proved for left-hand limits
and for limits in which the independent variable x can only take on
whole number values. We leave the verification of these modifications
to the reader. Briefly stated, the above rules simply say that
the limit of a sum (or difference, product, quotient) of two functions
is equal to the sum (or difference, product, quotient) of the limits.
The rules can easily be extended to any sum, product, etc., of a
finite number of functions.

An application. If f(x) is continuous in an interval a < x < b,
then by definition, $\lim_{\Delta x \to 0} f(x + \Delta x) = f(x)$ for every x in
the interval.

If also g(x) is continuous in a < x < b, then, applying the above
limit rules, so are (f ± g), (f · g), and also (f/g) except at points
where g = 0.

Now f(x) = x is clearly continuous. Therefore f(x) · f(x) = $x^2$ is
continuous for all values of x. Applying the same rule repeatedly
we see that f(x) = $x^n$ is continuous for all x and for any n = 1, 2,
3, 4, etc. Also any constant function is obviously continuous. We
conclude from our rules that any polynomial function

\[ f(x) = a_0 + a_1 x + \cdots + a_n x^n \]

is continuous, and finally we conclude that any rational function

\[ f(x) = \frac{a_0 + a_1 x + \cdots + a_n x^n}{b_0 + b_1 x + \cdots + b_m x^m} \]

is continuous except where the denominator vanishes.



C. The Derivative 

Definition 

In section A of this chapter it was indicated that the slope as
given by the chord approximation approaches the exact slope of the
tangent line at a point if $\Delta x_i$ is allowed to approach zero in
the limit.

\[ \text{Approximate slope at } x = \frac{\Delta y}{\Delta x} = \frac{\Delta f}{\Delta x} \qquad (1) \]

In accordance with our earlier discussion we define the limit,

\[ \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} \quad \text{(if it exists)} \qquad (2) \]

to be the slope of the tangent line at (x, y); and consequently the
tangent line is defined to be the line through (x, y) with slope given
by (2). The value of (2) is usually indicated by

\[ \frac{dy}{dx} \quad \text{or} \quad f'(x) \quad \text{or} \quad \frac{df}{dx} \]

Caution: $\frac{dy}{dx}$ is a symbolic abbreviation for $\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$.

Now Δy/Δx is a true quotient. However dy/dx is not a quotient. That
is to say, we do not define what dy and dx are, but only what dy/dx is.
Later on we shall give dy, dx independent meanings.

The value of (2) is called also the derivative of f with respect to x,
or sometimes the rate of change of f with respect to x.

Needless to say, (2) may very well fail to exist, at least for certain
“bad” values of x. It is at any rate clear that if (2) does exist,
then Δy must tend to zero with Δx. That is, we must have

\[ \lim_{\Delta x \to 0} \Delta y = 0, \quad \text{or} \quad \lim_{\Delta x \to 0} f(x + \Delta x) = f(x) \]

Consequently, if (2) exists for a given x-value, then f(x) must be
continuous at that x-value. However, it is possible to find examples of
functions which are continuous for all values of x but which do not
possess a derivative for any value of x. Thus the existence of a
derivative is a stronger condition than continuity. However, all
functions which we shall encounter are “differentiable” except possibly
for certain isolated values of x.

Observe: In (2) we have written Δx → 0 and not Δx → 0(+)
or Δx → 0(−). That is, Δx may be either > 0 or < 0. The definition
(2) is a two-sided limit. It is possible clearly to define the
derivative of f(x) on the right (i.e., $\lim_{\Delta x \to 0(+)} \frac{\Delta f}{\Delta x}$) and the
derivative on the left. We shall not bother about those minor refinements.



To illustrate the algebraic technique for calculating the
derivative of a simple function, consider

\[ f(x) = x^3 \]

Then

\[ \frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{x^3 - (x - \Delta x)^3}{\Delta x} \]

Expanding this expression we find

\[ \frac{df}{dx} = \lim_{\Delta x \to 0} \{ 3x^2 - 3x\,\Delta x + (\Delta x)^2 \} \]

In the limit the slope clearly tends to a value

\[ \frac{df}{dx} = 3x^2 \]

We define $3x^2$ then to be the slope of $f(x) = x^3$ at the point x.
The tangent line to the curve $f(x) = x^3$ at $x_0$ is the straight line
through $(x_0, x_0^3)$ having the slope $3x_0^2$.

In general the function $f = x^n$ where n is a positive integer can
be demonstrated to have the derivative

\[ \frac{d}{dx} x^n = n x^{n-1} \]

The proof of this can be achieved by expanding the ratio

\[ \frac{df}{dx} = \lim_{\Delta x \to 0} \frac{x^n - (x - \Delta x)^n}{\Delta x} \]

The expansion of $(x - \Delta x)^n$ is

\[ (x - \Delta x)^n = x^n - n x^{n-1}\,\Delta x + \text{higher terms in } \Delta x \]

Subtracting from $x^n$ and dividing by Δx gives

\[ \frac{df}{dx} = \lim_{\Delta x \to 0} \{ n x^{n-1} + \text{terms in } \Delta x \} = n x^{n-1} \]


Another interesting example is the derivatives of sin x and
cos x.

\[ \frac{d}{dx} \sin x = \lim_{\Delta x \to 0} \frac{\sin x - \sin(x - \Delta x)}{\Delta x} \]

Expand sin (x − Δx)

\[ \sin(x - \Delta x) = \sin x \cos\Delta x - \cos x \sin\Delta x \]

Remembering that, for small Δx,

\[ \cos\Delta x \to 1, \qquad \sin\Delta x \to \Delta x \]

then

\[ \frac{d}{dx} \sin x = \lim_{\Delta x \to 0} \left[ \frac{\sin x - \sin x + \Delta x \cos x}{\Delta x} \right] = \cos x \]

The reader can use the same technique to show that

\[ \frac{d}{dx} \cos x = -\sin x \]
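
The difference quotients used above, which step backward from x to x − Δx, are easy to tabulate. The sketch below is only illustrative: the name `backward_difference`, the sample point x = 1.3 and the step sizes are choices made here.

```python
import math


def backward_difference(f, x, dx):
    """The ratio [f(x) - f(x - dx)] / dx used in the derivations above."""
    return (f(x) - f(x - dx)) / dx


def cube(t):
    """The example function t^3."""
    return t ** 3


x = 1.3
for dx in (0.1, 0.01, 0.001):
    print(dx,
          backward_difference(cube, x, dx), 3 * x ** 2,
          backward_difference(math.sin, x, dx), math.cos(x))
```

For the cube the gap between the ratio and 3x² is exactly −3xΔx + (Δx)², the terms that drop out in the limit.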


Six Golden Rules 


It is of great importance to lessen the task of calculating $\frac{d}{dx} f(x)$.
This is done most conveniently by deriving formal rules of
manipulation for the symbol $\frac{d}{dx}$.

The main formal properties of the operation $\frac{d}{dx}$ are listed below.
In 1)–4) we assume that f(x), g(x) are functions which have
derivatives for some value x. That being so, then all the other functions
(e.g., f(x) + g(x)) occurring in the formulas also have derivatives
at x:

1) $\frac{d}{dx} \{ f(x) \pm g(x) \} = \frac{d}{dx} f(x) \pm \frac{d}{dx} g(x)$

2) $\frac{d}{dx} \{ c\,f(x) \} = c\,\frac{d}{dx} f(x)$ (c = constant)

3) $\frac{d}{dx} \{ f(x) \cdot g(x) \} = \frac{df}{dx}\,g + f\,\frac{dg}{dx}$ (product rule)

4) $\frac{d}{dx} \left\{ \frac{f(x)}{g(x)} \right\} = \frac{g(x)\,\frac{df}{dx} - f(x)\,\frac{dg}{dx}}{g^2(x)}$ (quotient rule, if g(x) ≠ 0)

5) $\frac{d}{dx} f(u(x)) = \frac{df}{du} \cdot \frac{du}{dx}$ (chain rule)

6) $\frac{dy}{dx} = \frac{1}{(dx/dy)}$ (inverse function rule)


Formulas 1)–4) are straightforward rules which tell how to
calculate derivatives of sums, products and quotients. Note that 2) is
really a special case of 3) if we take g(x) = c. Formulas 5) and 6)
will be explained after we show how to prove 1)–4). These really
follow very directly from the definition of $\frac{df}{dx}$ and from the
properties of limits mentioned above. In fact, we leave the
demonstration of 1) and 2) as an exercise.

In order to demonstrate rule 3), put

\[ f(x - \Delta x) = f(x) - \Delta f \]

and

\[ g(x - \Delta x) = g(x) - \Delta g \]

where Δf and Δg represent the change from f(x) and g(x) respectively
when the variable changes by Δx.

Then

\[ \frac{d}{dx} \{ f(x)\,g(x) \} = \lim_{\Delta x \to 0} \left\{ \frac{f(x)\,g(x) - f(x - \Delta x)\,g(x - \Delta x)}{\Delta x} \right\} \]

\[ = \lim_{\Delta x \to 0} \frac{f(x)\,g(x) - [f(x) - \Delta f]\,[g(x) - \Delta g]}{\Delta x} \]

\[ = \lim_{\Delta x \to 0} \left\{ f\,\frac{\Delta g}{\Delta x} + g\,\frac{\Delta f}{\Delta x} - \Delta f\,\frac{\Delta g}{\Delta x} \right\} \]

If Δf and Δg → 0 as Δx → 0 then

\[ \frac{d}{dx}(fg) = f\,\frac{dg}{dx} + g\,\frac{df}{dx} \]

Rule 4) is demonstrated in the same manner.

\[
\frac{d}{dx}\left( \frac{f}{g} \right) = \lim_{\Delta x \to 0} \frac{1}{\Delta x} \left\{ \frac{f}{g} - \frac{f - \Delta f}{g - \Delta g} \right\}
= \frac{1}{g}\,\frac{df}{dx} - \frac{f}{g^2}\,\frac{dg}{dx}
\]


Examples 

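Let n be a whole number (n = 1, 2, 3, etc.). Then for $f(x) = x^{-n} = \frac{1}{x^n}$
we have, applying the quotient rule,

\[ \frac{d}{dx}\left( x^{-n} \right) = \frac{x^n \cdot \frac{d(1)}{dx} - 1 \cdot \frac{d(x^n)}{dx}}{(x^n)^2} \]

Since $\frac{d}{dx}(1) = 0$ and $\frac{d}{dx} x^n = n x^{n-1}$, this becomes

\[ \frac{d}{dx}\left( x^{-n} \right) = -\frac{n x^{n-1}}{x^n \cdot x^n} = -n\,x^{-(n+1)} \]

which shows that the rule

\[ \frac{d}{dx} x^n = n x^{n-1} \]

holds for all n = 0, ±1, ±2, ±3, etc., and for all x if n ≥ 0, for
all x ≠ 0 if n < 0.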

The rules so far listed permit us to calculate the derivatives of
any polynomial function.

\[ f(x) = A_0 x^n + A_1 x^{n-1} + \cdots + A_n \qquad (A_n = \text{constant}) \]

Namely, applying 1) repeatedly we have

\[ \frac{d}{dx} f(x) = \frac{d}{dx}(A_0 x^n) + \frac{d}{dx}(A_1 x^{n-1}) + \cdots + \frac{d}{dx}(A_n) \]

Applying 2), we get

\[ \frac{d}{dx} f(x) = A_0 \frac{d}{dx}(x^n) + A_1 \frac{d}{dx}(x^{n-1}) + \cdots + A_{n-1} \frac{d}{dx}(x) \]

From the formula above, we obtain finally

\[ \frac{df(x)}{dx} = n A_0 x^{n-1} + (n - 1) A_1 x^{n-2} + \cdots + 2 A_{n-2}\,x + A_{n-1} \]

Thus,

\[ \frac{d}{dx}(3x^2 + 4x + 5) = 6x + 4 \]

and

\[ \frac{d}{dx}(x^3 + x) = 3x^2 + 1 \]

As an exercise consider what happens if we apply the operation
$\frac{d}{dx}$ to the polynomial $f(x) = A_0 x^n + A_1 x^{n-1} + \cdots + A_n$
(n + 1) times in succession.
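
The term-by-term rule for polynomials amounts to a one-line operation on the list of coefficients. The sketch below is illustrative; the function name `differentiate` and the convention of listing coefficients from the highest power down are assumptions made here.

```python
def differentiate(coeffs):
    """Differentiate A_0 x^n + A_1 x^(n-1) + ... + A_n.

    coeffs is [A_0, A_1, ..., A_n]; the result lists the coefficients of the
    derivative in the same order. An empty list stands for the zero polynomial.
    """
    n = len(coeffs) - 1
    # The term A_k x^(n-k) becomes (n - k) A_k x^(n-k-1); the constant A_n drops out.
    return [(n - k) * a for k, a in enumerate(coeffs[:-1])]


print(differentiate([3, 4, 5]))      # 3x^2 + 4x + 5  ->  [6, 4], i.e. 6x + 4
print(differentiate([1, 0, 1, 0]))   # x^3 + x        ->  [3, 0, 1], i.e. 3x^2 + 1
```

Feeding the output back into `differentiate` repeatedly is a convenient way to experiment with the exercise above.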

The quotient rule permits us further to differentiate any rational
function. E.g.,

\[
\frac{d}{dx} \left\{ \frac{3x^2 + 4x + 5}{x^3 + x} \right\}
= \frac{(x^3 + x)\,\frac{d}{dx}(3x^2 + 4x + 5) - (3x^2 + 4x + 5)\,\frac{d}{dx}(x^3 + x)}{(x^3 + x)^2}
\]

\[ = \frac{(x^3 + x)(6x + 4) - (3x^2 + 4x + 5)(3x^2 + 1)}{(x^3 + x)^2} \]

The “Chain Rule” (i.e., rule 5) above) tells how to differentiate
a function of a function. Namely, suppose that f(u) is a function of
u defined for a < u < b, and suppose that $f'(u) = \frac{df}{du}$ exists for some
u in that interval. Suppose further that u is made to depend on
another variable x, say u = u(x), defined for α < x < β; and let $\frac{du}{dx}$
exist for an x-value corresponding to the u-value at which $\frac{df}{du}$ is
assumed to exist. Then f = f(u(x)), regarded as a function of x, has a
derivative at that value x, and

\[ \frac{df}{dx} = \frac{df}{du}(u) \cdot \frac{du}{dx}(x) \]

Let f(u) = y, and f(u + Δu) = y + Δy.
Also put u(x + Δx) = u + Δu.

Then

\[
\frac{f(u(x + \Delta x)) - f(u(x))}{\Delta x} = \frac{f(u + \Delta u) - f(u)}{\Delta x}
= \left[ \frac{f(u + \Delta u) - f(u)}{\Delta u} \right] \frac{\Delta u}{\Delta x}
\]

Let Δx → 0. Then also Δu → 0, since Δu/Δx is assumed to have a
finite limit. Therefore, the above expression tends to $\frac{df}{du} \cdot \frac{du}{dx}$.


Further Examples 

1. f(u) = sin u, u = $3x^2 + 5$

Then

\[ \frac{d}{dx} f(u(x)) = \frac{d \sin u}{du} \cdot \frac{du}{dx} = \cos u \cdot \frac{du}{dx} \]

Or

\[ \frac{d}{dx} \sin(3x^2 + 5) = [\cos(3x^2 + 5)]\,6x \]

2. f(u) = cos u, u(x) = sin x

Then

\[ \frac{df}{dx} = \frac{d}{du} \cos u \cdot \frac{du}{dx} = -\sin u \cdot \cos x \]

Or

\[ \frac{d}{dx} \cos(\sin x) = -[\sin(\sin x)] \cos x \]

3. f(u) arbitrary, u = cx (c = constant)

Then

\[ \frac{df}{dx} = \frac{df}{du} \cdot \frac{du}{dx} = \frac{df}{du} \cdot c \]

For example

\[ \frac{d}{dx} \sin 3x = 3 \cos 3x \]

And

\[ \frac{d}{dx} \cos(cx) = -c \sin cx \]
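
These chain-rule results can be compared with a numerical slope. The sketch below uses a symmetric (central) difference, which is a choice made here for accuracy rather than the one-sided ratio of the text; the function name, the sample point x = 0.7 and the step size are likewise assumptions.

```python
import math


def central_difference(f, x, dx=1e-6):
    """Symmetric estimate of the slope of f at x (an illustrative choice)."""
    return (f(x + dx) - f(x - dx)) / (2 * dx)


x = 0.7

# Each pair holds (numerical slope, value given by the chain rule).
example_1 = (central_difference(lambda t: math.sin(3 * t * t + 5), x),
             math.cos(3 * x * x + 5) * 6 * x)
example_2 = (central_difference(lambda t: math.cos(math.sin(t)), x),
             -math.sin(math.sin(x)) * math.cos(x))
example_3 = (central_difference(lambda t: math.sin(3 * t), x),
             3 * math.cos(3 * x))

for numeric, formula in (example_1, example_2, example_3):
    print(numeric, formula)
```

Agreement to several decimal places at one sample point is of course a check, not a proof.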


Rule 6) for inverse functions is as follows.

Let y = f(x) be defined and single-valued for a < x < b, having
a derivative ≠ 0 at some x = $x_0$ in that interval. Put $y_0 = f(x_0)$.
Suppose it is possible to find a single-valued, continuous function g(y)
for y near $y_0$ such that y = f(g(y)) when y is near $y_0$. g(y) is none
other than one of the possible branches of the “inverse function”
obtained by solving y = f(x) for x. Then 6) is to be interpreted as
meaning that x = g(y) has a derivative $\frac{dx}{dy}$ at $y_0$, its value being
$1 \big/ \frac{dy}{dx}$, with $\frac{dy}{dx}$ being calculated at $x_0$. The situation is
illustrated below.

The proof is very simple. We have

\[ \frac{g(y + \Delta y) - g(y)}{\Delta y} = \frac{x + \Delta x - x}{\Delta y} = \frac{\Delta x}{\Delta y} = \frac{1}{(\Delta y / \Delta x)} \]

which goes to

\[ \frac{1}{\;\frac{dy}{dx}\;} \]


Examples: 


1.) y = x 2 , 


Then 


Or 


? = 2x,so^ = ±^ = -L 

dx dy dy 2x 

d\/y 1 . d 1 

dy 2Vy dy W 2 W 


x = Vy 


which is none other than the general rule — x n = nx n_1 with n = 

dx 

1/2 and x replaced by y, which is just as good a letter as x. 


2.) y = sin x, 


x = arc sin y 


Then 


Or 


1 


1 


dx _ 1 _ 
dy dy cosx’ ± yfl — sin 2 x 
dx 


arc sin y = 


± 


dy 

3.) y = cos x, 

dx _ 1 _ __ i 

dy — sin x ± V1 — cos 2 x 


x = arc cos y 
i 
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or 


— (arc cos y) = ± — 
dy V yJ 


/ 


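4.) $y = \tan x = \dfrac{\sin x}{\cos x}$, $x = \arctan y$

Then

$$\frac{dx}{dy} = \frac{1}{dy/dx} \quad\text{and}\quad \frac{dy}{dx} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x}$$

Thus

$$\frac{dx}{dy} = \cos^2 x$$

or

$$\frac{d \arctan y}{dy} = \cos^2 x$$

To eliminate x,

$$y^2 = \frac{\sin^2 x}{\cos^2 x} = \frac{1 - \cos^2 x}{\cos^2 x} = \frac{1}{\cos^2 x} - 1$$

or

$$\cos^2 x = \frac{1}{1 + y^2},$$

whence the formula

$$\frac{d}{dy}(\arctan y) = \frac{1}{1 + y^2}$$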
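The inverse-function rule can be checked numerically. The sketch below is only an illustration, not part of the original text: the helper names `numerical_derivative` and `inverse_rule_gap` are hypothetical, and derivatives are estimated with a central difference rather than computed exactly. It compares the slope of the inverse branch g at $y_0$ with $1/(dy/dx)$ at $x_0$ for the worked examples above.

```python
import math

def numerical_derivative(func, point, step=1e-6):
    """Central-difference estimate of func'(point) (an approximation)."""
    return (func(point + step) - func(point - step)) / (2.0 * step)

def inverse_rule_gap(forward, inverse, x0):
    """Return |g'(y0) - 1/f'(x0)| where y0 = f(x0) and g is the inverse branch."""
    y0 = forward(x0)
    dg_dy = numerical_derivative(inverse, y0)
    df_dx = numerical_derivative(forward, x0)
    return abs(dg_dy - 1.0 / df_dx)

# Example 1: y = x^2 with the positive branch x = sqrt(y)
gap_sqrt = inverse_rule_gap(lambda x: x * x, math.sqrt, 1.5)
# Example 2: y = sin x with the principal branch of arc sin
gap_asin = inverse_rule_gap(math.sin, math.asin, 0.4)
# Example 4: y = tan x with the principal branch of arc tan
gap_atan = inverse_rule_gap(math.tan, math.atan, 0.7)
# Closed formula d(arctan y)/dy = 1/(1 + y^2), checked at y = 2
atan_formula_gap = abs(numerical_derivative(math.atan, 2.0) - 1.0 / (1.0 + 2.0 ** 2))

print(gap_sqrt, gap_asin, gap_atan, atan_formula_gap)
```

All four printed gaps should be very small, limited only by the finite-difference step.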


Let y = f(x) be defined and single-valued in some interval
a < x < b. Suppose that f(x) has a local maximum at some point $x_0$ of
that interval, meaning simply that $f(x_0 + \Delta x) \le f(x_0)$ whenever
$|\Delta x|$ is sufficiently small. Then

$$\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}\ \begin{cases} \ge 0 & \text{when } \Delta x < 0 \\ \le 0 & \text{when } \Delta x > 0 \end{cases}$$

If $f'(x_0)$ exists, it follows that its value must be $\ge 0$ if $x = (x_0 + \Delta x) \to x_0$
from the left, but must be $\le 0$ if $x \to x_0$ from the right.
The only possible value for $f'(x_0)$ is therefore zero. Geometrically,
this merely says that the tangent to the curve y = f(x) must be
horizontal at a local maximum, if the tangent exists to begin with.
Similar considerations apply to local minima.


Example 1.

f(x) = sin x. The maxima or minima can occur only where
f'(x) = cos x = 0, hence only for

$$x = \pm\pi/2,\ \pm 3\pi/2,\ \pm 5\pi/2,\ \text{etc.}$$

Example 2.

$$f(x) = (x - 1)\cdot(x^2 + 3x)$$

By the product rule, $f'(x) = (x - 1)(2x + 3) + 1\cdot(x^2 + 3x)$

or

$$f'(x) = 3x^2 + 4x - 3$$

which

$$= 0 \quad\text{when}\quad x = \frac{-4 \pm \sqrt{52}}{6}$$

or approximately

$$x = \frac{-4 \pm 7.21}{6} = 0.54 \text{ and } -1.87$$
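As a quick check of Example 2, the following sketch (an illustration added here, with hypothetical names such as `f_prime` and `classify`) evaluates the roots of $3x^2 + 4x - 3$ from the quadratic formula and looks at the sign of f' on either side of each root, which is the left/right argument used above for a local maximum.

```python
import math

def f_prime(x):
    """Derivative of f(x) = (x - 1)(x^2 + 3x) by the product rule."""
    return (x - 1.0) * (2.0 * x + 3.0) + 1.0 * (x * x + 3.0 * x)

# f'(x) = 3x^2 + 4x - 3, so the quadratic formula gives the critical points.
discriminant = 4.0 ** 2 - 4.0 * 3.0 * (-3.0)          # = 52
roots = sorted([(-4.0 - math.sqrt(discriminant)) / 6.0,
                (-4.0 + math.sqrt(discriminant)) / 6.0])

def classify(x0, offset=1e-3):
    """Use the sign of f' just left and just right of x0."""
    left, right = f_prime(x0 - offset), f_prime(x0 + offset)
    if left > 0 and right < 0:
        return "maximum"
    if left < 0 and right > 0:
        return "minimum"
    return "undetermined"

kinds = [classify(r) for r in roots]
print(roots, kinds)
```

The smaller root comes out as a local maximum and the larger as a local minimum.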



The Mean Value Theorem 

Now suppose f(x) is a continuous function in $a \le x \le b$ with
f(a) = f(b) = 0, and suppose that f'(x) exists for all x in a < x < b.
If f = constant, then f = 0, since f = 0 at the end-points. In this
case f' = 0. If f $\neq$ constant, then f(x) must have either a maximum
or a minimum value for some $\xi$ between a, b (why?). Since $f'(\xi)$ is
assumed to exist, it must be zero. Thus in any case f'(x) must
vanish at some point between a, b (perhaps at several, or even all
points). This fact is known as Rolle’s theorem.



Now let f(x) be continuous in an interval $a \le x \le b$, and suppose
that f'(x) exists at all points x with a < x < b. Put f(a) = A
and f(b) = B. Define g(x) by

$$g(x) = B + \frac{b - x}{b - a}\,(A - B)$$

Then clearly f(x) - g(x) satisfies the conditions for Rolle’s
theorem, and so its derivative

$$f'(x) - g'(x) = f'(x) + \frac{A - B}{b - a} = f'(x) - \frac{B - A}{b - a}$$

must vanish for at least one $x = \xi$ between a, b, i.e.,

$$0 = f'(\xi) - \frac{B - A}{b - a}$$

or

$$f(b) = f(a) + f'(\xi)\cdot(b - a), \qquad (a < \xi < b)$$


This is known as the mean-value theorem of the differential calculus.
From it we conclude at once that

If f(x) is defined and single-valued in some interval $x_0 < x < x_1$
and if f'(x) exists and = 0 at all points of that interval (which forces f(x)
to be continuous), then f(x) = constant in that interval.

For taking $x_0 < a < b < x_1$ we have, by the M. V. T., f(b) =
f(a) + 0 · (b - a), or f(b) = f(a); and b, being arbitrary, f(x) =
f(a) = constant. This result is of great importance.
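To make the mean-value theorem concrete, the sketch below locates a point $\xi$ with $f'(\xi) = (f(b) - f(a))/(b - a)$ by bisection. This is an illustration under an added assumption, not the text's method: the hypothetical helper `mean_value_point` requires that $f'(x)$ minus the chord slope changes sign between a and b, which holds for the sample function $f(x) = x^3$ on (0, 2).

```python
def mean_value_point(f, f_prime, a, b, iterations=60):
    """Bisection search for xi in (a, b) with f'(xi) equal to the chord slope.

    Assumes f'(x) - slope has opposite signs at a and b (an extra assumption
    made for this sketch; the theorem itself does not need it).
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    h_lo = f_prime(lo) - slope
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        h_mid = f_prime(mid) - slope
        if (h_mid > 0) == (h_lo > 0):
            lo, h_lo = mid, h_mid      # sign unchanged: root lies to the right
        else:
            hi = mid                   # sign changed: root lies to the left
    return 0.5 * (lo + hi)

cube = lambda x: x ** 3
cube_prime = lambda x: 3.0 * x * x
xi = mean_value_point(cube, cube_prime, 0.0, 2.0)
# f(b) = f(a) + f'(xi) (b - a) should reproduce f(2) = 8
reconstructed = cube(0.0) + cube_prime(xi) * (2.0 - 0.0)
print(xi, reconstructed)
```

For this function the chord slope is 4, so $\xi = 2/\sqrt{3}$.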


Higher Derivatives 

Once the first derivative has been taken, assuming that it exists,
it represents another function. If one takes the derivative (or slope)
of the first derivative the result is called the second derivative.

$$\frac{d}{dx}\left(\frac{df}{dx}\right) = \frac{d^2 f}{dx^2}$$

In the same manner the derivative of the second derivative is the
third derivative (or the derivative operation applied 3 times).

$$\frac{d}{dx}\left\{\frac{d}{dx}\left[\frac{df}{dx}\right]\right\} = \frac{d}{dx}\left\{\frac{d^2 f}{dx^2}\right\} = \frac{d^3 f}{dx^3}$$


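The n-th derivative is the operation of differentiation applied n
times.

$$\underbrace{\frac{d}{dx}\left\{\frac{d}{dx}\left\{\cdots\frac{df}{dx}\right\}\right\}}_{n\ \text{terms}} = \frac{d^n f}{dx^n}$$

As an example, consider

$$f(x) = x^3$$

$$\frac{df}{dx} = 3x^2, \qquad \frac{d^2 f}{dx^2} = 6x, \qquad \frac{d^3 f}{dx^3} = 6, \qquad\text{and}\qquad \frac{d^4 f}{dx^4} = 0$$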


In general

$$\frac{d}{dx}x^n = nx^{n-1}$$

$$\frac{d^2}{dx^2}x^n = n(n-1)x^{n-2}$$

$$\frac{d^m}{dx^m}x^n = n(n-1)(n-2)\cdots(n-m+1)\,x^{n-m}; \qquad (m \le n)$$

Then

$$\frac{d^m}{dx^m}x^n = 0 \quad\text{if } n < m$$


Another example is f(x) = sin x

$$\frac{df}{dx} = \cos x$$

$$\frac{d^2 f}{dx^2} = -\sin x$$

and so on.


Maxima and Minima 


A necessary condition for a maximum
or minimum point $x_0$ in f(x) is that the
slope of f(x) at $x_0$ be zero.

If the slope vanishes this alone will not 
provide information as to whether the 
point of zero slope is a maximum or a 
minimum point in the curve f(x). 

Maxima have negative second derivatives. 
To see this pictorially plot a maximum 
point and the region about it. 

In addition plot the slope in this interval 
as a function of x. 

Notice that the maximum in f(x) has
a corresponding $\frac{df}{dx}$ curve whose slope (i.e.,
$\frac{d}{dx}\frac{df}{dx}$) is negative.

This situation was noted in Chapter 5
when we found that the curvature matrix
$D^2$ gave a negative result for curves concave
downward.
Minima have a positive second derivative
at the point of zero slope. Again if
we plot $\frac{df}{dx}$ against x we see that a minimum
point corresponds to a positive slope
in $\frac{df}{dx}$ at $x_0$.

The curvature matrix $D^2$ gave a positive
result for curves concave upward.

Points of inflection. There exists a special
case in which the function f(x) has a
slope $\frac{df}{dx}$ of zero at $x_0$;
and further
has a zero second derivative. Points such
as these are called points of inflection,
provided the second derivative changes sign at $x_0$.

Examples:

Examine

$$f(x) = \frac{1}{1 + x^2}$$

The point at x = 0 is a point of zero
slope.

$$\frac{df}{dx} = \frac{-2x}{(1 + x^2)^2}, \qquad \left[\frac{df}{dx}\right] = 0 \quad\text{at } x = 0$$

The second derivative is

$$\frac{d}{dx}\left(\frac{df}{dx}\right) = \frac{-2}{(1 + x^2)^2} + \frac{2\cdot 2x(2x)}{(1 + x^2)^3}, \qquad \left[\frac{d^2 f}{dx^2}\right] = -2 \quad\text{at } x = 0$$

The curve has a maximum at x = 0.
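The second-derivative test for $f(x) = 1/(1 + x^2)$ at x = 0 can be sketched numerically. The finite-difference helpers below (`first_difference`, `second_difference`) are hypothetical names introduced for this illustration, and their results are approximations to the derivatives rather than exact values.

```python
def f(x):
    return 1.0 / (1.0 + x * x)

def first_difference(func, x0, h=1e-3):
    """Central-difference approximation to the slope at x0."""
    return (func(x0 + h) - func(x0 - h)) / (2.0 * h)

def second_difference(func, x0, h=1e-3):
    """Central-difference approximation to the second derivative at x0."""
    return (func(x0 + h) - 2.0 * func(x0) + func(x0 - h)) / (h * h)

slope_at_zero = first_difference(f, 0.0)
curvature_at_zero = second_difference(f, 0.0)
# zero slope together with negative second derivative indicates a maximum
is_maximum = abs(slope_at_zero) < 1e-9 and curvature_at_zero < 0.0
print(slope_at_zero, curvature_at_zero, is_maximum)
```

The estimated second derivative is close to the value -2 found analytically.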
Now consider 




This curve has a point of inflection at
the point x = 0.


D. The Integral 

The Definite Integral 


In defining the approximate area under
a curve in section A the method of rectangular
partitioning was utilized.

Our object at this stage is to define the
exact area under a curve, say f(x), in the
interval from a to b.

The procedure to follow will be repetitive in that we will review
our previous development in somewhat greater detail.
In the diagram to the right we cut the interval (a, b) into say n
pieces of length $\Delta x_1, \Delta x_2, \ldots, \Delta x_n$; possibly but not necessarily
of equal length.

In each interval we choose a point say $\bar{x}_i$ in the interval $\Delta x_i$.

$$x_{i-1} \le \bar{x}_i \le x_i.$$

Now construct a rectangle of width $\Delta x_i$
and height $f(\bar{x}_i)$.

The rectangle provides an approximation
to the area under the curve in the i-th
interval $\Delta x_i$, and the area of the rectangle is

$$(\text{height}) \times (\text{width}) = f(\bar{x}_i)\,\Delta x_i$$



We tacitly assume that $f(\bar{x}_i) > 0$;
otherwise of course the area comes out
with a minus sign. In defining integrals we shall incorporate the
possibility of having negative areas when $f(\bar{x}_i) < 0$.

The total area in the interval a → b in the rectangle approximation
is then

$$A = \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i$$

Again it is plausible that the value of A approaches some
“limiting value” as the subdivisions $\Delta x_i$ of the interval $a \le x \le b$
are made smaller and smaller. As the widths $\Delta x_i$ are made smaller
the number of rectangular sections, n, increases in a manner appropriate
to span the interval (a, b).


[Figure: a coarse subdivision and a finer subdivision of the area under f(x).]





In order to define the “exact” area under f(x) in the interval
$a \le x \le b$ we allow the widths of the subdivisions to approach zero;

$$\Delta x_i \to 0$$

This of course obliges the number of subdivisions n to increase
without limit. If N is a number which is
arbitrarily large, then

$$\lim_{\substack{\Delta x_i \to 0 \\ n \to N}} \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i = \text{the area under } f(x) \text{ in the interval } a \to b$$

As we mentioned previously areas lying
under the x axis are to be considered
negative.

To exhibit this calculation of the sum
in the limit let us utilize the simple curve

$$y = x$$

in the interval $0 \le x \le b$.

Let the subdivision points be $x_1, x_2, \ldots, x_{n-1}$,
putting $x_0 = 0$ and $x_n = b$.

Then

$$\Delta x_i = x_i - x_{i-1}$$

and $f(\bar{x}_i) = \bar{x}_i$ for this case. The area from $x_0$ to $x_n$ is then

$$A = \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i = \sum_{i=1}^{n} \bar{x}_i\,\Delta x_i$$

$$A = \sum_{i=1}^{n} \bar{x}_i\,(x_i - x_{i-1})$$
Since $x_{i-1} \le \bar{x}_i \le x_i$, we have the relation

$$x_{i-1}\,\Delta x_i \le \bar{x}_i\,\Delta x_i \le x_i\,\Delta x_i$$

or

$$x_{i-1}(x_i - x_{i-1}) \le \bar{x}_i(x_i - x_{i-1}) \le x_i(x_i - x_{i-1})$$

Therefore the area in question lies between

$$A_1 = \sum_{i=1}^{n} x_{i-1}(x_i - x_{i-1}), \quad\text{and}\quad A_2 = \sum_{i=1}^{n} x_i(x_i - x_{i-1})$$




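Expanding the terms in the two sums we find

$$A_1 = \sum_{i=1}^{n} x_{i-1}x_i - \sum_{i=1}^{n} x_{i-1}^2$$

and

$$A_2 = \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i x_{i-1}$$

Hence

$$A_1 + A_2 = \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_{i-1}^2$$

$$= (x_1^2 + x_2^2 + \cdots + x_n^2) - (x_0^2 + x_1^2 + \cdots + x_{n-1}^2)$$

$$= x_n^2 - x_0^2 = b^2 \text{ in this case}$$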
The sum $A_1 + A_2$ is independent of the size and number of the
subdivisions!

It is interesting to note that the area under the curve y = x in
the interval $0 \le x \le b$ is equal to the
mean value of the sum of the upper
bound $A_2$ and the lower bound $A_1$.

$$\text{Area} = \frac{1}{2}\{A_1 + A_2\} = \frac{1}{2}b^2$$


The difference $A_2 - A_1$ is also of interest.

$$-A_1 + A_2 = \sum_{i} x_i^2 - 2\sum_{i} x_{i-1}x_i + \sum_{i} x_{i-1}^2$$

$$= \sum_{i} (x_i - x_{i-1})^2$$

$$= \sum_{i=1}^{n} (\Delta x_i)^2$$

This algebraic result is illustrated graphically above. $(A_2 - A_1)$
is the sum of the shaded squares which is merely the sum of the $(\Delta x_i)^2$.

We wish to see what happens as the $\Delta x_i$ tend to zero; n of course
must at the same time increase without bound. This increase we indicate
by the symbol

$$n \to \infty$$

Let h be the greatest of the $\Delta x_i$. Then $\Delta x_i \le h$ for all i, and

$$(\Delta x_i)^2 \le \Delta x_i\,h$$

whence

$$\sum_{i=1}^{n} (\Delta x_i)^2 \le \sum_{i=1}^{n} \Delta x_i\,h = h\sum_{i=1}^{n} \Delta x_i = hb$$
This relation provides an inequality for the difference $A_2 - A_1$

$$A_2 - A_1 = \sum_{i=1}^{n} (\Delta x_i)^2 \le b \times (\text{max value of } \Delta x_i)$$

Therefore as $\Delta x_i \to 0$ (this implies of course that the largest
$\Delta x_i \to 0$) it follows that

$$A_2 - A_1 \to 0$$

Adding $(A_2 + A_1) + (A_2 - A_1) = 2A_2$
we get

$$2A_2 \to b^2 \text{ in the limit } \Delta x_i \to 0$$

Subtracting $(A_2 + A_1) - (A_2 - A_1) = 2A_1$ we get

$$2A_1 \to b^2 \text{ in the limit } \Delta x_i \to 0$$

In conclusion in the limit as $\Delta x_i \to 0$

$$\lim_{\Delta x_i \to 0} A_1 = \lim_{\Delta x_i \to 0} A_2 = A = \frac{1}{2}b^2$$

Since

$A_1 \le A \le A_2$, it follows that

$$A = \frac{1}{2}b^2$$

Accordingly we define $A = (1/2)b^2$ as the area under the curve
y = x in the interval 0 to b. Clearly this definition gives us the usual
area for a triangle.
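The bounds derived above can be tested on an arbitrary, unequal subdivision. The following sketch is an added illustration with hypothetical names (`lower_upper_sums`, a random partition of 50 interior points, b = 3). It forms the lower sum $A_1$ and upper sum $A_2$ for y = x and checks that $A_1 + A_2 = b^2$, that $A_2 - A_1 = \sum (\Delta x_i)^2$, and that this difference does not exceed hb.

```python
import random

def lower_upper_sums(points):
    """Lower and upper rectangle sums for y = x on the partition `points`."""
    lower = sum(points[i - 1] * (points[i] - points[i - 1])
                for i in range(1, len(points)))
    upper = sum(points[i] * (points[i] - points[i - 1])
                for i in range(1, len(points)))
    return lower, upper

random.seed(0)                      # fixed seed so the run is repeatable
b = 3.0
interior = sorted(random.uniform(0.0, b) for _ in range(50))
points = [0.0] + interior + [b]     # x_0 = 0, x_n = b

lower, upper = lower_upper_sums(points)
widths = [points[i] - points[i - 1] for i in range(1, len(points))]
h = max(widths)                     # the greatest of the subdivision widths
sum_of_squares = sum(w * w for w in widths)

print(lower + upper, upper - lower, sum_of_squares, h * b)
```

However the interior points are chosen, the sum of the two bounds stays at $b^2$, while their difference shrinks with the largest width h.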

Needless to say the preceding calculation is cumbersome, and it 
involves a certain amount of manipulation which is limited to the 
specific function. One of our main goals at this point is to find a 
more convenient method for computing areas. 

The preceding example however does illustrate in a particularly 
simple manner the limiting process. 

Suppose that f(x) is defined and single-valued in $a \le x \le b$.
Divide this interval into n parts (not necessarily equal) by points
$x_1, x_2, \ldots, x_{n-1}$. Put $a = x_0$ and $b = x_n$, so that we have
$x_0 < x_1 < x_2 < \cdots < x_n$. The i-th subinterval is then $x_{i-1} \le x \le x_i$. In that interval
choose some point $\bar{x}_i$, in the manner that we discussed earlier. The
situation is shown below.

The area of the rectangle of
height $f(\bar{x}_i)$ and base $\Delta x_i = (x_i - x_{i-1})$ is $f(\bar{x}_i)\,\Delta x_i$. Put

$$S = \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i \qquad (3)$$

(Such a sum is called a Riemann sum). Let $\Delta x$ = max. of $\Delta x_1, \ldots, \Delta x_n$.
We consider S as a function of $\Delta x$. (It is in general many-valued, because,
for one thing, the $x_i$ can be chosen in many ways; so can the $\bar{x}_i$).



We now define

$$\int_a^b f(x)\,dx = \lim_{\Delta x \to 0(+)} S \qquad (4)$$

provided that limit actually exists. The numerical value of (4) we
interpret (by definition) as the area under the curve y = f(x) from
x = a to x = b, areas under the x-axis being reckoned negative.
The dx appearing in (4) is, for the time being, a decoration. The
point involved is that x may in turn be considered as a function of
a new variable t, say with x running from a to b as t runs from $\alpha$
to $\beta$. We can then consider f(x) = f(x(t)) as a function of t, and
accordingly we can define

$$\int_\alpha^\beta f(x(t))\,dt$$

In general this will not have the same value as (4). Therefore just
$\int_a^b f(x)$ would not be an adequate symbol.

A Fundamental Fact: The limit (4) exists if f(x) is continuous in $a \le x \le b$,
and more generally if f(x) is continuous save for a finite number of ordinary
jump discontinuities in $a \le x \le b$.

We omit the proof of this theorem.

Among the various sums S in (4) are those for which all the $\Delta x_i$
are equal to $\frac{b - a}{n}$ and for which the $\bar{x}_i$ are chosen to be say at
one end-point ($x_{i-1}$ or $x_i$) or perhaps half way between them. If
$\lim_{\Delta x \to 0(+)} S$ exists for all possible S, then it must exist for the special
ones just mentioned, and therefore in actually calculating integrals
it is sufficient to consider only sums S of the special types. For example,
our earlier calculation of $\int_0^b x\,dx$ can be done in that way
as follows:

We take

$$x_i = i\cdot\frac{b}{n} \qquad (i = 0, 1, 2, \ldots, n)$$

so

$$\Delta x_i = \frac{b}{n}$$

We take

$$\bar{x}_i = x_i = \frac{i\cdot b}{n}$$

Then

$$S = f(\bar{x}_1)\,\Delta x_1 + \cdots + f(\bar{x}_n)\,\Delta x_n$$

$$= \frac{b}{n}\cdot\frac{b}{n} + \cdots + \frac{nb}{n}\cdot\frac{b}{n} = \frac{b^2}{n^2}(1 + 2 + 3 + \cdots + n)$$

$$= \frac{b^2}{n^2}\cdot\frac{n(n+1)}{2} = \frac{1}{2}b^2\left(1 + \frac{1}{n}\right)$$

Now $\Delta x \to 0$ here means $n \to \infty$, or $\frac{1}{n} \to 0$, whence

$$S \to \frac{1}{2}b^2 \quad\text{as } \Delta x \to 0$$
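The equal-subdivision sum can be evaluated directly and compared with the closed form $\frac{1}{2}b^2(1 + 1/n)$. The function name `right_endpoint_sum` and the sample value b = 2 are choices made for this sketch only.

```python
def right_endpoint_sum(f, a, b, n):
    """Riemann sum with n equal pieces, sampling f at each right end-point."""
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(1, n + 1))

b = 2.0
sums = {n: right_endpoint_sum(lambda x: x, 0.0, b, n) for n in (10, 100, 1000)}
closed_form = {n: 0.5 * b * b * (1.0 + 1.0 / n) for n in sums}

for n in sorted(sums):
    print(n, sums[n], closed_form[n])
```

As n grows the sum approaches $\frac{1}{2}b^2 = 2$, with the excess falling off like 1/n.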


Consideration of the integral between definite limits a and b defines the
definite integral

$$\int_a^b f(x)\,dx$$

which has an interpretation as an area between f(x) and the x axis in
the interval a → b.

We shall see in the next section that an indefinite integral can be
defined as the inverse operation to that of differentiation.


The Indefinite Integral 

That the operation of integration is the inverse to the operation of
differentiation has been indicated several times in previous discussions.
In particular in section A we demonstrated that the slope of
the integral (or area) function at a point was equal to the integrand
evaluated at that point.


To see this in a slightly different
manner consider the definite integral
from a to X of f(x)

$$A(X) = \int_a^X f(x)\,dx \qquad (1)$$

Now take the definite integral from a to $(X + \Delta X)$,

$$A(X + \Delta X) = \int_a^{X+\Delta X} f(x)\,dx \qquad (2)$$

The area in the small interval $\Delta X$ we shall denote as $\Delta A$.

$$\Delta A = A(X + \Delta X) - A(X) = \int_a^{X+\Delta X} f(x)\,dx - \int_a^X f(x)\,dx = \int_X^{X+\Delta X} f(x)\,dx \qquad (3)$$


Note carefully that

$$\int_X^{X+\Delta X} f(x)\,dx$$

is the area in the small trapezoid having a base $\Delta X$.
Therefore the average value of f(x) in the
interval $\Delta X$ is $\bar{f}$,

$$f(X) \le \bar{f} \le f(X + \Delta X)$$

where

$$\bar{f}\,\Delta X = \int_X^{X+\Delta X} f(x)\,dx \qquad (4)$$

We now divide both sides of equation (3)
by $\Delta X$.

$$\frac{\Delta A}{\Delta X} = \frac{1}{\Delta X}\int_X^{X+\Delta X} f(x)\,dx = \bar{f}$$

In the limit as $\Delta X$ approaches zero $\bar{f}$ approaches f(X); thus

$$\lim_{\Delta X \to 0} \frac{\Delta A}{\Delta X} = \frac{dA}{dX} = \lim_{\Delta X \to 0} \bar{f} = f(X)$$

Finally if we denote the integral by a function A(x) then the integrand
f(x) is equal to the first derivative of the integral.

This relation provides a convenient method for evaluating
integrals.

If

$$A(x) = \int^x f(x')\,dx'$$

then

$$f(x) = \frac{dA(x)}{dx}$$


To illustrate the usefulness of this equation let

$$A(x) = x^n$$

then

$$\frac{dA}{dx} = nx^{n-1} = f(x)$$

Because of this, the indefinite integral of $nx^{n-1}$ is

$$\int nx^{n-1}\,dx = x^n \qquad (\text{where } n \neq 0)$$


Using this method we see that the definite
integrals in most instances can be evaluated
from the indefinite integral.

If

$$A(x) = \int^x f(x')\,dx'$$

then

$$A(b) - A(a) = \int_a^b f(x)\,dx$$

Use our example of $f(x) = nx^{n-1}$ for
n = 4

Then

$$A(x) = \int^x 4x'^3\,dx' = x^4$$

and considering the area from x = 1 to x = 2,

$$A(2) - A(1) = (2)^4 - (1)^4 = 16 - 1 = \int_1^2 4x^3\,dx = 15$$

The area in the interval x = 1 to x = 2 is then 15.
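The value 15 can be confirmed without using the antiderivative at all, by summing thin rectangles. The sketch below is illustrative only: the helper `midpoint_integral` is a hypothetical name, and the midpoint rule is simply one convenient choice of Riemann sum.

```python
def midpoint_integral(f, a, b, n=1000):
    """Riemann sum with n equal pieces, sampling f at each midpoint."""
    dx = (b - a) / n
    return dx * sum(f(a + (i + 0.5) * dx) for i in range(n))

integrand = lambda x: 4.0 * x ** 3      # f(x) = 4 x^3
antiderivative = lambda x: x ** 4       # A(x) = x^4

area_by_sum = midpoint_integral(integrand, 1.0, 2.0)
area_by_antiderivative = antiderivative(2.0) - antiderivative(1.0)

print(area_by_sum, area_by_antiderivative)
```

The two numbers agree to within the small error of the rectangle approximation.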

The case when n = 0 provides the natural logarithm function
which we will take up in a later section.

As another example consider A(x) = sin x;
then

$$\frac{dA}{dx} = \cos x = f(x)$$

and

$$\int \cos x\,dx = \sin x$$

If we ask for the area under the cosine from x = 0 to x = π/2,

$$A(\pi/2) - A(0) = \sin \pi/2 - \sin 0 = 1 = \int_0^{\pi/2} \cos x\,dx$$

The reader is encouraged to investigate one or more of the tables of
integrals which can be obtained. These tables have large compilations
of the integrals for various functions.


The Golden Rules 


The so-called golden rules for derivatives must have their analogues
in the case of integration. In this section the Golden Rules of
differentiation will be listed with their inverse relations for the
integral.

1. $\dfrac{d}{dx}(F(x) \pm G(x)) = \dfrac{dF}{dx} \pm \dfrac{dG}{dx}$

Then

$$(F(x) \pm G(x)) = \int\left(\frac{dF}{dx}\,dx \pm \frac{dG}{dx}\,dx\right) = \int (f(x) \pm g(x))\,dx$$

where f = dF/dx and g = dG/dx.

2. $\dfrac{d}{dx}(cF(x)) = c\,\dfrac{dF}{dx}$; when c = a constant.

In the same manner

$$\int c f(x)\,dx = c\int f(x)\,dx = c\,F(x)$$

3. The product rule gives integration by parts.

$$\frac{d}{dx}(F\cdot G) = F\cdot\frac{dG}{dx} + G\cdot\frac{dF}{dx}$$

Then

$$F(x)\cdot G(x) = \int F\,dG + \int G(x)\,dF(x)$$

or

$$\int F\,dG = F(x)\,G(x) - \int G(x)\,dF(x)$$

To illustrate the integration by parts consider the definite integral

$$\int_0^{\pi/2} x\cos x\,dx$$

Let F(x) = x and dG(x) = cos x dx where G(x) = sin x and dF =
dx, then

$$\int_0^{\pi/2} x\cos x\,dx = \Big[x\sin x\Big]_0^{\pi/2} - \int_0^{\pi/2} \sin x\,dx = \Big[x\sin x + \cos x\Big]_0^{\pi/2}$$

$$= \pi/2 - 0 + \cos \pi/2 - \cos 0$$

$$= \pi/2 - 1$$
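The integration-by-parts result can be cross-checked numerically. In this sketch (an added illustration; `midpoint_integral` is a hypothetical helper) the left side $\int_0^{\pi/2} x\cos x\,dx$ is summed directly, and the right side is assembled from the boundary term and $\int_0^{\pi/2}\sin x\,dx$.

```python
import math

def midpoint_integral(f, a, b, n=2000):
    """Riemann sum with n equal pieces, sampling f at each midpoint."""
    dx = (b - a) / n
    return dx * sum(f(a + (i + 0.5) * dx) for i in range(n))

upper = math.pi / 2.0

# Left side: integral of F dG with F = x, dG = cos x dx
left_side = midpoint_integral(lambda x: x * math.cos(x), 0.0, upper)

# Right side: [F G] at the limits minus the integral of G dF, with G = sin x
boundary_term = upper * math.sin(upper) - 0.0 * math.sin(0.0)
right_side = boundary_term - midpoint_integral(math.sin, 0.0, upper)

exact = math.pi / 2.0 - 1.0
print(left_side, right_side, exact)
```

Both sides come out close to $\pi/2 - 1 \approx 0.5708$.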

4. The quotient rule if G(x) $\neq$ 0

$$\frac{d}{dx}(F/G) = \frac{1}{G}\frac{dF}{dx} - \frac{F}{G^2}\frac{dG}{dx}$$

From this result

$$\int \frac{dF(x)}{G(x)} = \frac{F(x)}{G(x)} + \int \frac{F(x)}{[G(x)]^2}\,dG(x)$$


5. Chain Rule.

$$\frac{d}{dx}\{F(u(x))\} = \frac{dF}{du}\frac{du}{dx}$$

then

$$F(u(x)) = \int \frac{dF}{du}\frac{du}{dx}\,dx = \int f(u)\,\frac{du}{dx}\,dx$$
To demonstrate this rule consider

$$\int \cos(1 + x^2)\cdot 2x\,dx = \sin(1 + x^2)$$

In this example F(u) = sin(u) and u = $(1 + x^2)$

Examples of integration.

1. $\cos\{\sin x\} = -\displaystyle\int \{\sin(\sin x)\}\cos x\,dx$

2. $\sin 3x = \displaystyle\int 3\cos 3x\,dx$

3. $\arctan x = \displaystyle\int \frac{dx}{(1 + x^2)}$
Multiple Integrals 

Because the function f(x) can be written as an integral our so-called
one dimensional integral, which gives the area between f(x)
and the x axis in the interval of interest, can be written as a multiple
integral.

The area between x = a and x = b has the integral

$$A(b) - A(a) = \int_a^b \left\{\int_0^{f(x)} df'\right\} dx = \int_a^b\!\int_0^{f(x)} df'\,dx$$

This method of writing the integral implies;

First, that the trapezoidal section in dx be swept out by the
operation

$$\int_0^{f(x)} df'$$

(here f' is a variable of integration, not a derivative).

Second, this trapezoid is summed from x = a
to x = b.

Note!

We have defined the area between f(x) and the x axis by utilizing
a lower limit on the df' integration of

$$f' = 0$$


The area between f(x) and the mirror
image -f(x) could have been obtained by
integrating df' from -f(x) to f(x).

$$\int_a^b\!\int_{-f(x)}^{f(x)} df'\,dx = \int_a^b 2f(x)\,dx$$

This approach gives twice the area
which is to be expected.

We are now in a position to compute
areas bounded by certain curves.

Take the curve

$$x^2 + y^2 = R^2$$

This is a circle, defined by the
equation of constraint,

$$y(x) = \pm\sqrt{R^2 - x^2}$$

The area is

$$\int_{-R}^{+R}\!\int_{y=-\sqrt{R^2-x^2}}^{y=+\sqrt{R^2-x^2}} dy\,dx = 2\int_{-R}^{+R}\sqrt{R^2 - x^2}\,dx = 2R\int_{-R}^{+R}\left(1 - \frac{x^2}{R^2}\right)^{1/2} dx$$

Let $x/R = \sin\xi$, then $\dfrac{dx}{R} = \cos\xi\,d\xi$.

When $x = \pm R$; $\sin\xi = \pm 1$; and $\xi = \pm\pi/2$.
Substituting into the integral

$$2R^2\int_{-\pi/2}^{\pi/2}\sqrt{1 - \sin^2\xi}\,\cos\xi\,d\xi = 2R^2\int_{-\pi/2}^{\pi/2}\cos^2\xi\,d\xi = 2R^2\int_{-\pi/2}^{\pi/2}\frac{1}{2}(1 + \cos 2\xi)\,d\xi$$

$$= \pi R^2$$
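Both forms of the circle-area integral can be summed numerically as a check on the substitution. This is a sketch with assumed values (R = 2 and a hypothetical `midpoint_integral` helper); it evaluates $2\int_{-R}^{R}\sqrt{R^2 - x^2}\,dx$ and $2R^2\int_{-\pi/2}^{\pi/2}\cos^2\xi\,d\xi$ and compares each with $\pi R^2$.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    """Riemann sum with n equal pieces, sampling f at each midpoint."""
    dx = (b - a) / n
    return dx * sum(f(a + (i + 0.5) * dx) for i in range(n))

R = 2.0

# Original variable: strip of height 2*sqrt(R^2 - x^2) swept from -R to R
area_in_x = 2.0 * midpoint_integral(
    lambda x: math.sqrt(max(0.0, R * R - x * x)), -R, R)

# After the substitution x/R = sin(xi): integrand becomes cos^2(xi)
area_in_xi = 2.0 * R * R * midpoint_integral(
    lambda xi: math.cos(xi) ** 2, -math.pi / 2.0, math.pi / 2.0)

exact_area = math.pi * R * R
print(area_in_x, area_in_xi, exact_area)
```

The substituted form converges faster because its integrand is smooth at the end-points, whereas the square root has a vertical tangent at $x = \pm R$.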
In general constraints in two dimensions are provided by equations
of the following type,

$$y = y(x)$$

(for example $y = \pm\sqrt{R^2 - x^2}$)

This problem is equivalent to that of finding the area contained
between two curves $y_1(x)$ and $y_2(x)$, and between x = a and x = b.

If we ask for the area enclosed
between $y_1(x)$ and $y_2(x)$ in the
interval a, b the result is

$$\int_a^b\!\int_{y_1(x)}^{y_2(x)} dx\,dy$$

The order of differential elements indicates
that the integration over y should
be taken first.

As an example compute the area between

$$y_2 = +x^2$$

and

$$y_1 = -x^2$$

in the interval x = 0 to x = 2;
then

$$\int_0^2\!\int_{-x^2}^{+x^2} dx\,dy = \int_0^2 (x^2 - [-x^2])\,dx = \frac{2}{3}x^3\Big]_0^2 = \frac{16}{3}$$
The computation of volumes is performed in a similar manner.
Assume that we are given a closed surface z = z(x, y) with a
boundary condition.

$$z_{\min}(x, y) \le z \le z_{\max}(x, y)$$

then

$$\text{Volume} = \int_{x_{\min}}^{x_{\max}}\!\int_{y_{\min}(x)}^{y_{\max}(x)}\!\int_{z_{\min}(x,y)}^{z_{\max}(x,y)} dx\,dy\,dz$$

Consider as an example the sphere

$$x^2 + y^2 + z^2 = R^2$$

Then $z = \pm\sqrt{R^2 - x^2 - y^2}$

with the projection of the cube dx dy dz on the xy plane having a
maximum and minimum value given by

$$y_{\max/\min} = \pm\sqrt{R^2 - x^2}$$

Then

$$\text{Volume} = \int_{-R}^{R}\!\int_{-\sqrt{R^2-x^2}}^{+\sqrt{R^2-x^2}}\!\int_{-\sqrt{R^2-x^2-y^2}}^{+\sqrt{R^2-x^2-y^2}} dx\,dy\,dz = \frac{4}{3}\pi R^3$$

In many cases the integrations are simpler in curvilinear
coordinates.

For example an area in polar coordinates utilizes an element of
area

$$dA = r\,dr\,d\phi$$

In cylindrical coordinates the volume
element is

$$dV = r\,dr\,d\phi\,dz$$

In spherical coordinates the volume
element is

$$dV = r^2\,dr\,\sin\theta\,d\theta\,d\phi$$

To illustrate the case in which an
integral is computed in spherical coordinates
let us compute the volume of
a sphere.

In spherical coordinates the equation
for the spherical surface is

r = R = a constant.

Note that the $\theta$ and $\phi$ dependence do
not appear.

$$\text{Volume} = \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{R} r^2\,dr\,\sin\theta\,d\theta\,d\phi = \frac{4}{3}\pi R^3$$
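The spherical-coordinate integral separates into three one-dimensional integrals, which makes a numerical check short. The sketch below is an added illustration: R = 1.5 is an arbitrary sample radius, `midpoint_integral` is a hypothetical helper, and the limits 0 to R, 0 to π and 0 to 2π are the usual ranges of r, θ and φ.

```python
import math

def midpoint_integral(f, a, b, n=1000):
    """Riemann sum with n equal pieces, sampling f at each midpoint."""
    dx = (b - a) / n
    return dx * sum(f(a + (i + 0.5) * dx) for i in range(n))

R = 1.5

# The integrand r^2 sin(theta) is a product, so the triple integral factors.
radial_part = midpoint_integral(lambda r: r * r, 0.0, R)          # R^3 / 3
polar_part = midpoint_integral(math.sin, 0.0, math.pi)            # 2
azimuthal_part = midpoint_integral(lambda phi: 1.0, 0.0, 2.0 * math.pi)  # 2 pi

volume = radial_part * polar_part * azimuthal_part
exact_volume = 4.0 / 3.0 * math.pi * R ** 3
print(volume, exact_volume)
```

The factorisation is what makes the spherical form so much simpler than the Cartesian triple integral with its nested square-root limits.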
E. Differential Equations


The solution of differential equations is one of the major everyday 
tasks of the scientist. The field is large and at most this section of 
the discussion of the calculus can but give a slight feeling for the 
subject and some of the basic approaches. 

In many cases differential equations are solved by shrewd guesses 
or insight. In other words one assumes a solution and inserts this 
solution into the differential equation as a trial solution. Consistency 
in the final result may be the test of solution. On the other hand 
some problems are subject to straightforward solution by integration. 

An elementary example of this is the problem considered with
the matrix operators in Chapter 4. We considered

$$D f(x) = g(x)$$

where

$$g(x) = x$$

This problem was solved by multiplying by $D^{-1}$. However, this
operation is equivalent to integration

$$D^{-1}\cdot D\cdot f = f = D^{-1}\cdot g$$

In terms of differential operators this equation becomes

$$\frac{d}{dx}f(x) = g(x) = x$$

Note that D is now $\dfrac{d}{dx}$.

Multiply by dx and integrate (equivalent to $D^{-1}$).

$$\int \frac{df}{dx}\,dx = \int df = f(x) = \int_{\text{constant}}^{x} x'\,dx' = \frac{x^2}{2} + \text{constant}$$
Because this section is not intended as a complete exposition it
will be limited to the problem of developing the most common functions
of mathematics and science, the exponential and trigonometric
functions.

When a differential equation is linear the derivatives appear only 
to the zeroth or first power. Note carefully the difference between 
degree and power. The equation 

S + cN + ^ = ° 

is of second degree, however it is linear. 

The equation

$$\frac{d^2 f}{dx^2} + P\left(\frac{df}{dx}\right)^2 + Qf = 0, \text{ is of second degree}$$

but is not linear; notice that $\dfrac{df}{dx}$ is taken to the second power.

Linear differential equations are subject to solution in series.
Whether or not a solution exists at a point depends upon the functions
p(x) and q(x) in

$$\frac{d^2 f}{dx^2} + p(x)\frac{df}{dx} + q(x)f(x) = 0$$

This equation has solutions of the type

$$f(x) = x^{\alpha}\sum_{n=0}^{\infty} a_n x^n.$$

Linear equations with constant coefficients have exponential or 
trigonometric solutions. 

Consider the differential equation

$$\frac{df}{dx} = f$$

and assume a series solution

$$f = \sum_{n=0}^{\infty} a_n x^n$$

Then

$$\frac{df}{dx} = \sum_{n=1}^{\infty} a_n\,n\,x^{n-1} = \sum_{m=0}^{\infty} a_{m+1}(m+1)\,x^m$$

Substituting into the initial differential equation we obtain

$$\sum_{m=0}^{\infty}\{a_{m+1}(m+1) - a_m\}\,x^m = 0$$

Since the value of x is arbitrary, this sum can only vanish
if each coefficient of $x^m$ vanishes.

Thus

$$a_{m+1} = \frac{a_m}{m+1}$$

with

$$a_1 = a_0, \qquad a_2 = \frac{a_1}{2} = \frac{a_0}{1\cdot 2}, \qquad a_3 = \frac{a_2}{3} = \frac{a_0}{1\cdot 2\cdot 3}, \qquad a_n = \frac{a_0}{n!}$$

The solution of this equation is then,

$$f(x) = a_0\sum_{n=0}^{\infty}\frac{1}{n!}x^n = a_0 e^x$$
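The recurrence $a_{m+1} = a_m/(m+1)$ is easy to iterate. The sketch below is an added illustration (the function name `series_solution` and the cut-off of 30 terms are choices made here). It builds the partial sum term by term and compares it with the library exponential.

```python
import math

def series_solution(x, a0=1.0, terms=30):
    """Partial sum of f(x) = sum a_n x^n with a_{m+1} = a_m / (m + 1)."""
    coefficient = a0        # a_0
    power = 1.0             # x^0
    total = 0.0
    for m in range(terms):
        total += coefficient * power
        coefficient = coefficient / (m + 1)   # the recurrence from df/dx = f
        power *= x
    return total

value_at_one = series_solution(1.0)               # should approximate e
scaled_value = series_solution(2.0, a0=3.0)       # should approximate 3 e^2
print(value_at_one, math.e, scaled_value, 3.0 * math.exp(2.0))
```

Thirty terms are far more than needed for these arguments, since the coefficients fall off like 1/n!.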
The sum defines the exponential to the power x;

$$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!}$$

where

$$\frac{de^x}{dx} = \sum_{n=1}^{\infty}\frac{n}{n!}\,x^{n-1} = \sum_{m=0}^{\infty}\frac{x^m}{m!} = e^x$$


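At this point we should note that this series solution defines the
natural logarithmic integral.
Rearrange the differential equation,

$$\frac{df}{dx} = f$$

to be

$$\frac{df}{f} = dx, \quad\text{so that}\quad \int\frac{df}{f} = x$$

We know that $f(x) = e^x$ thus

$$x = \log_e e^x = \log_e f(x)$$

Therefore

$$\int\frac{df}{f} = \log_e f(x)$$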


Just as the exponential function was derived, the sine and cosine
functions are defined by the linear differential equation

$$\frac{d^2 g}{dx^2} + g = 0$$
Assume a solution

$$g(x) = \sum_{n=0}^{\infty} b_n x^n$$

then

$$\frac{d^2 g}{dx^2} = \sum_{n=2}^{\infty} b_n\,n(n-1)\,x^{n-2} = \sum_{m=0}^{\infty} b_{m+2}(m+2)(m+1)\,x^m$$

Substituting into the original equation we find the relation

$$\sum_{m=0}^{\infty}\{(m+2)(m+1)\,b_{m+2} + b_m\}\,x^m = 0$$

Again the coefficient of $x^m$ must vanish giving,

$$b_{m+2} = \frac{-b_m}{(m+2)(m+1)}$$

Two different series are defined from this relation. If we assign a
non zero value to $b_0$ all of the even b’s are defined;

$$b_2 = \frac{-b_0}{2\cdot 1}, \qquad b_4 = \frac{-b_2}{4\cdot 3} = +\frac{b_0}{4\cdot 3\cdot 2\cdot 1}$$

and

$$b_{2n} = \frac{(-1)^n\,b_0}{(2n)!}$$

This first series is called the even series

$$g^{\text{even}}(x) = b_0\cos x = b_0\sum_{n=0}^{\infty}\frac{(-1)^n}{(2n)!}\,x^{2n}$$

and defines the cosine (an even function of x);

$$\cos x = \sum_{n=0}^{\infty}\frac{(-1)^n x^{2n}}{(2n)!}$$

An even function is defined as a function which does not change
sign when x is replaced by -x.

Note:

$$\cos(-x) = \sum_{n=0}^{\infty}\frac{(-1)^n(-x)^{2n}}{(2n)!} = \sum_{n=0}^{\infty}\frac{(-1)^n x^{2n}}{(2n)!} = +\cos x$$


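The second series (the odd series) is obtained by assigning a non zero
value to $b_1$; then all of the odd b’s are defined,

$$b_3 = \frac{-b_1}{3\cdot 2}, \qquad b_5 = \frac{-b_3}{5\cdot 4} = +\frac{b_1}{5\cdot 4\cdot 3\cdot 2}$$

and

$$b_{2n+1} = \frac{(-1)^n\,b_1}{(2n+1)!}$$

The odd series defines the sine function (an odd function of x)

$$g^{\text{odd}}(x) = b_1\sin x = b_1\sum_{n=0}^{\infty}\frac{(-1)^n}{(2n+1)!}\,x^{(2n+1)}$$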
An odd function of x changes sign when x is replaced by (-x).

$$\sin(x) = \sum_{n=0}^{\infty}\frac{(-1)^n x^{(2n+1)}}{(2n+1)!} = -\sum_{n=0}^{\infty}\frac{(-1)^n(-x)^{(2n+1)}}{(2n+1)!} = -\sin(-x)$$

The reader now has three basic series at his disposal; that for $e^x$,
cos x, and sin x. Using the imaginary $\sqrt{-1} = i$, we are able to prove
that

$$e^{ix} = \cos x + i\sin x$$

by direct substitution into the series for the exponential. The result
should separate into two series; one with the coefficient +1, and
the other with the coefficient $i = \sqrt{-1}$.

The relation shown above can be also demonstrated except for a
sign by showing that $e^{ix}$ is a solution of

$$\frac{d^2 g}{dx^2} + g = 0$$
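The recurrence $b_{m+2} = -b_m/((m+2)(m+1))$ generates both series at once, and the relation $e^{ix} = \cos x + i\sin x$ can then be checked against the library's complex exponential. The sketch below is illustrative; the function name `oscillator_series`, the 40-term cut-off and the sample point x = 1.2 are assumptions made here.

```python
import cmath
import math

def oscillator_series(x, b0, b1, terms=40):
    """Partial sum of g(x) = sum b_n x^n with b_{m+2} = -b_m / ((m+2)(m+1))."""
    coefficients = [b0, b1]
    for m in range(terms - 2):
        coefficients.append(-coefficients[m] / ((m + 2) * (m + 1)))
    return sum(c * x ** n for n, c in enumerate(coefficients))

x = 1.2
cos_from_series = oscillator_series(x, b0=1.0, b1=0.0)   # even series
sin_from_series = oscillator_series(x, b0=0.0, b1=1.0)   # odd series

# Euler's relation: compare cos x + i sin x with exp(i x)
euler_gap = abs(cmath.exp(1j * x) - complex(cos_from_series, sin_from_series))
print(cos_from_series, math.cos(x), sin_from_series, math.sin(x), euler_gap)
```

Choosing $b_0 = 1, b_1 = 0$ picks out the cosine, and $b_0 = 0, b_1 = 1$ picks out the sine, exactly as in the derivation.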


In closing we should realize in the case of the last equation that
there are two independent solutions

$$g^{\text{even}} = b_0\cos x$$

and

$$g^{\text{odd}} = b_1\sin x$$

The most general solution to this differential equation is a linear
combination of the independent solutions

$$g(x) = b_0\cos x + b_1\sin x$$

The values of $b_0$ and $b_1$ are set by the particular problem to be
solved.







[Figure: mass on a spring, showing the displacement from equilibrium.]
As an example consider a mass
M sliding on a frictionless table
and connected to a rigid vertical
wall by a spring having a spring
constant k.

According to Newton’s 2nd Law the mass times the acceleration
of the body is equal to the force F acting upon the body (F =
-kx, where x = the displacement of the system from equilibrium).

$$M\frac{d^2 x}{dt^2} = -kx$$

divide by M and define $\tau = \left(\dfrac{k}{M}\right)^{1/2} t$,

then

$$\frac{d^2 x}{d\tau^2} = -x$$

The solution is

$$x(t) = b_0\cos\tau + b_1\sin\tau$$

$$= b_0\cos\left[\left(\frac{k}{M}\right)^{1/2} t\right] + b_1\sin\left[\left(\frac{k}{M}\right)^{1/2} t\right]$$

If we now stipulate that the system starts from rest ($\frac{dx}{dt} = 0$ at
t = 0) with an initial amplitude L, we can determine $b_0$ and $b_1$.
The initial velocity and displacement of the system are called
the initial conditions.

At t = 0

$$x(0) = L = b_0\cos 0 + b_1\sin 0 = b_0$$

and

$$\left.\frac{dx}{dt}\right|_{t=0} = \left(\frac{k}{M}\right)^{1/2}(-b_0\sin 0 + b_1\cos 0) = \left(\frac{k}{M}\right)^{1/2} b_1 = 0$$

Thus

$$b_0 = L$$

and

$$b_1 = 0$$

Our solution with these initial conditions is

$$x(t) = L\cos\left[\left(\frac{k}{M}\right)^{1/2} t\right]$$
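The closed-form solution can be compared with a direct step-by-step integration of $M\,d^2x/dt^2 = -kx$. The sketch below is only an illustration: the fourth-order Runge-Kutta stepping, the function name `simulate_spring`, and the sample values M = 2, k = 8, L = 0.5 are all assumptions introduced here, not taken from the text.

```python
import math

def simulate_spring(mass, k, amplitude, t_end, steps=10000):
    """Integrate M x'' = -k x from rest at x = amplitude using RK4 steps."""
    dt = t_end / steps
    x, v = amplitude, 0.0                     # initial conditions: x(0)=L, dx/dt=0

    def accel(position):
        return -k * position / mass           # Newton's 2nd law with F = -k x

    for _ in range(steps):
        k1x, k1v = v, accel(x)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x

M, k, L, t_end = 2.0, 8.0, 0.5, 3.0
numerical_x = simulate_spring(M, k, L, t_end)
analytic_x = L * math.cos(math.sqrt(k / M) * t_end)
print(numerical_x, analytic_x)
```

With these sample values $(k/M)^{1/2} = 2$, so the comparison is against $0.5\cos 6$.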


One further remark should be included. The reader will notice
the similarity between this problem and say the eigenvalue problem
in two dimensions (the ellipse). Both problems have a set of
independent solutions; two eigenvectors in the case of the ellipse
and two independent functions in the case of the differential
equation.

The two orthogonal eigenvectors of the ellipse problem form a
set of base vectors in two dimensions. As a result any arbitrary
vector in that space can be represented in terms of its projection
along the two bases.

The two independent solutions of the differential equation in a
sense form a set of bases in the respect that any arbitrary configuration
of the system described (of the spring system in our particular
example) can be expanded in base functions.

Think about it.


F. Applications ofok 



The kinematical quantities, posiT 
require explicit use of the differenf 
proper definition. 

To provide a systematic development we shall first develop these 
quantities in one dimension and then extend our definition to two 
dimensions. The formulation of velocities and accelerations in three 
dimensions is a simple extension of the two dimensional problem. 

To further simplify the problem we shall use as an example to 
illustrate the technique the motion of a particle with constant 
acceleration. 
Velocity in One Dimension

Suppose that we are observing an object moving
in a straight line, and that we obtain the distance
of the object from an arbitrary origin O as
a function of time.

The average velocity in a time interval $(t_2 - t_1)$
is defined as

$$\bar{v} = \frac{x(t_2) - x(t_1)}{t_2 - t_1}$$

The path of the object in space-time is shown in the graph above.
We see that the average velocity is the chord approximation for the
slope of the x(t) curve in the interval $(t_2 - t_1)$.

The instantaneous velocity at a given time $t_1$ can be obtained in
the limit as the interval $(t_2 - t_1)$ goes to zero.

Let $(t_2 - t_1) = \Delta t$

Then $t_2 = t_1 + \Delta t$

and

$$v = \lim_{\Delta t \to 0}\frac{x(t_1 + \Delta t) - x(t_1)}{(t_1 + \Delta t) - t_1} = \lim_{\Delta t \to 0}\frac{x(t_1 + \Delta t) - x(t_1)}{\Delta t}$$

or

$$v = \frac{dx}{dt}$$
The instantaneous velocity is the slope of the space-time curve.
To illustrate this definition first consider a body moving with constant
velocity

$$x(t) = x_0 + v_0 t$$

Then

$$v(t) = \frac{dx}{dt} = v_0$$
Another simple example arises when we consider the curve x(t)
which is quadratic in the time

$$x(t) = x_0 + v_0 t + \frac{1}{2}a_0 t^2$$

In this case

$$v(t) = \frac{dx}{dt} = v_0 + a_0 t$$

In general we can plot a corresponding velocity-time curve, and
in this simple case the velocity is linear in time.



The acceleration is defined as the time rate of change of the velocity.
In order to compute the acceleration we must use the velocity-time
curve as shown below. Obviously if only the x(t) curve is
given we must obtain the v(t) by differentiation.

The average acceleration $\bar{a}$ in the time interval $(t_2 - t_1)$ is defined
as

$$\bar{a} = \frac{v(t_2) - v(t_1)}{(t_2 - t_1)}$$

As in the case of the velocity the instantaneous acceleration “a”
at a time say $t_1$ is obtained by letting the time interval $\Delta t = (t_2 - t_1)$
approach zero in the limit

$$a = \lim_{\Delta t \to 0}\frac{v(t + \Delta t) - v(t)}{(t + \Delta t) - t} = \lim_{\Delta t \to 0}\frac{v(t + \Delta t) - v(t)}{\Delta t}$$

or

$$a = \frac{dv}{dt}$$

The acceleration is the slope of the velocity curve in one dimension.

Because $v = \dfrac{dx}{dt}$ we can write

$$a = \frac{d}{dt}\{v\} = \frac{d}{dt}\left\{\frac{dx}{dt}\right\} = \frac{d^2 x}{dt^2}$$

and we see that the acceleration is the second derivative of the space-time
curve x(t).

Again we can use the previous examples to illustrate the calcula¬ 
tion of the acceleration. 

In the case of the function

$$ x(t) = x_0 + v_0 t $$

and

$$ v(t) = \frac{dx}{dt} = v_0 = \text{a constant} $$

so that

$$ a(t) = \frac{dv}{dt} = 0 $$

This is the function representing motion with constant velocity and
therefore zero acceleration.

The second example,

$$ x(t) = x_0 + v_0 t + \tfrac{1}{2} a_0 t^2 $$

and

$$ v(t) = v_0 + a_0 t $$

giving

$$ a(t) = \frac{dv}{dt} = a_0 = \text{a constant} $$

Thus this function represents motion with constant acceleration.
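
Both examples can be checked numerically. The sketch below uses a central second difference, which is a common numerical stand-in for the second derivative rather than the limit definition given in the text. The function names and the values of x_0, v_0 and a_0 are assumptions made for illustration.

```python
X0, V0, A0 = 1.0, 2.0, 3.0  # hypothetical values of x_0, v_0, a_0


def second_difference(f, t, dt):
    """Central second difference, an approximation to d^2 f / dt^2 at time t."""
    return (f(t + dt) - 2.0 * f(t) + f(t - dt)) / dt**2


def uniform_motion(t):
    """First example: x(t) = x_0 + v_0 t."""
    return X0 + V0 * t


def uniformly_accelerated(t):
    """Second example: x(t) = x_0 + v_0 t + (1/2) a_0 t^2."""
    return X0 + V0 * t + 0.5 * A0 * t**2


print(second_difference(uniform_motion, 2.0, 0.01))         # close to 0
print(second_difference(uniformly_accelerated, 2.0, 0.01))  # close to a_0
```

The first result should be zero and the second should equal a_0, both up to rounding error, in agreement with the two derivatives worked out above.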


Displacement in Two Dimensions 


The graphical representation of the motion of a particle in two 
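dimensions is not as easily displayed as in one dimension. If
$\mathbf{r}_2$ and $\mathbf{r}_1$ represent the position of a particle
at times $t_2$ and $t_1$ respectively, then the displacement in the
time interval $(t_2 - t_1)$ is

$$ \mathbf{r}_2 - \mathbf{r}_1 = \mathbf{r}(t_2) - \mathbf{r}(t_1) $$

Since

$$ \mathbf{r} = \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} $$

we can write

$$ \mathbf{r}_2 - \mathbf{r}_1 = \{x(t_2) - x(t_1)\}\,\mathbf{i} + \{y(t_2) - y(t_1)\}\,\mathbf{j} $$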
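
The component form of the displacement translates directly into a short computation. In the sketch below the path `position` is a made-up example and the function names are assumptions; a vector is represented simply as a pair of its i and j components.

```python
def position(t):
    """Hypothetical path r(t) = x(t) i + y(t) j, returned as the pair (x, y)."""
    return (3.0 * t, 4.0 * t - t**2)


def displacement(r, t1, t2):
    """Components of r(t2) - r(t1) along i and j."""
    x1, y1 = r(t1)
    x2, y2 = r(t2)
    return (x2 - x1, y2 - y1)


dx, dy = displacement(position, 1.0, 3.0)
print(f"displacement = ({dx}) i + ({dy}) j")
```

For this particular path the particle is at the same height at t = 1 and t = 3, so the displacement has no j component even though the particle moved vertically in between.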


Velocity in Two Dimensions 

Strictly speaking the term velocity is a vector quantity having
both direction and magnitude. The magnitude of the velocity is
called the speed.

Thus in the previous one-dimensional problem we were actually
computing the speed.

The diagram in section (a) above can be utilized to compute the
average velocity in the time interval $(t_2 - t_1)$ in a manner similar
to that used for the one-dimensional problem.


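$$ \mathbf{v}_{ave} = \frac{\mathbf{r}_2 - \mathbf{r}_1}{t_2 - t_1} = \frac{\mathbf{r}(t_2) - \mathbf{r}(t_1)}{t_2 - t_1} $$

$$ = \left\{ \frac{x(t_2) - x(t_1)}{t_2 - t_1} \right\} \mathbf{i} + \left\{ \frac{y(t_2) - y(t_1)}{t_2 - t_1} \right\} \mathbf{j} $$

$$ = v_x\,\mathbf{i} + v_y\,\mathbf{j} $$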


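The instantaneous vector velocity is defined by letting the time
interval $(t_2 - t_1) = \Delta t$ and then letting $\Delta t \to 0$ in
the limit.

$$ \mathbf{v} = \lim_{\Delta t \to 0} \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{(t + \Delta t) - t} = \lim_{\Delta t \to 0} \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t} $$

Thus

$$ \mathbf{v} = \lim_{\Delta t \to 0} \frac{\Delta \mathbf{r}}{\Delta t} = \frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j} $$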



The final form in terms of the time derivatives of the components 
x and y assumes that the base vectors i and j are held constant in 
time. 

It is possible to define the components of r in terms of a set of
rotating base vectors. Under such conditions we must also consider
the time derivatives of the bases.

The time-dependent position vector representing motion with
constant velocity $\mathbf{v}_0$ is

$$ \mathbf{r}(t) = \mathbf{r}_0 + \mathbf{v}_0 t $$

[Figure omitted: panels labeled "space-time curve" and "space curve".]

Then

$$ \mathbf{v}(t) = \frac{d\mathbf{r}}{dt} = \mathbf{v}_0 = v_{0x}\,\mathbf{i} + v_{0y}\,\mathbf{j} = \text{a constant vector} $$


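Motion with constant acceleration is represented by

$$ \mathbf{r}(t) = \mathbf{r}_0 + \mathbf{v}_0 t + \tfrac{1}{2} \mathbf{a}_0 t^2 $$

Here $\mathbf{r}_0$, $\mathbf{v}_0$, and $\mathbf{a}_0$ are all
constant vectors.

Let

$$ \mathbf{r}_0 = 0 \quad \text{(starts from the origin)} $$

$$ \mathbf{v}_0 = (v_0 \cos\alpha)\,\mathbf{i} + (v_0 \sin\alpha)\,\mathbf{j} $$

and

$$ \mathbf{a}_0 = -g\,\mathbf{j} $$

The equation for r then represents the position of a projectile fired
with an initial velocity $\mathbf{v}_0$ in a direction inclined at an
angle $\alpha$ relative to the horizontal.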

The acceleration will be shown to be equal in magnitude to g, the
acceleration of gravity, and to be directed in the negative j
direction.

Solving for the velocity in this special case we find

$$ \mathbf{v} = \frac{d\mathbf{r}}{dt} = \mathbf{v}_0 + \mathbf{a}_0 t = (v_0 \cos\alpha)\,\mathbf{i} + \{(v_0 \sin\alpha) - gt\}\,\mathbf{j} $$

We note that the particle moves in the i direction with a constant
velocity $(v_0 \cos\alpha)$.

The velocity in the y direction is linear in the time.

The use of base vectors enables us to keep track of the two
components of the motion with one set of symbols.
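
A short numerical sketch of the projectile velocity may help. It tabulates the two components at a few times, using g = 980 cm/sec^2 as quoted elsewhere in this chapter. The launch speed, the angle, and all function names are assumptions chosen for illustration.

```python
import math

G = 980.0  # cm/sec^2


def velocity(t, v0, alpha, g=G):
    """Components of v(t) = (v0 cos alpha) i + (v0 sin alpha - g t) j."""
    return (v0 * math.cos(alpha), v0 * math.sin(alpha) - g * t)


def speed(v):
    """Magnitude of a velocity vector given as a pair of components."""
    return math.hypot(v[0], v[1])


v0 = 2000.0                  # cm/sec, hypothetical launch speed
alpha = math.radians(30.0)   # hypothetical launch angle
for t in (0.0, 0.5, 1.0, 1.5):
    vx, vy = velocity(t, v0, alpha)
    print(f"t = {t:.1f}  vx = {vx:.3f}  vy = {vy:.3f}  speed = {speed((vx, vy)):.3f}")
```

The i component should be the same on every row, while the j component should drop by g for each second of flight.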


Acceleration in Two Dimensions 


The acceleration is defined in the velocity-time space. Again we
recognize the advantages of our unit vectors i and j. Because they
are dimensionless and serve to indicate direction only, the velocity
space can be superposed upon the position space.

The average acceleration in a time interval $(t_2 - t_1)$ is

$$ \mathbf{a}_{ave} = \frac{\mathbf{v}(t_2) - \mathbf{v}(t_1)}{t_2 - t_1} $$

As we let the time interval become arbitrarily small we define the
instantaneous acceleration at a time t

$$ \mathbf{a} = \lim_{\Delta t \to 0} \frac{\mathbf{v}(t + \Delta t) - \mathbf{v}(t)}{(t + \Delta t) - t} = \lim_{\Delta t \to 0} \frac{\Delta \mathbf{v}}{\Delta t} = \frac{d\mathbf{v}}{dt} $$

Since

$$ \mathbf{v} = \frac{d\mathbf{r}}{dt} $$

$$ \mathbf{a} = \frac{d}{dt}\frac{d\mathbf{r}}{dt} = \frac{d^2 \mathbf{r}}{dt^2} = \left\{\frac{d^2 x}{dt^2}\right\} \mathbf{i} + \left\{\frac{d^2 y}{dt^2}\right\} \mathbf{j} $$


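In our example of motion with constant velocity $\mathbf{v}_0$,

$$ \mathbf{r}(t) = \mathbf{r}_0 + \mathbf{v}_0 t $$

and

$$ \mathbf{v}(t) = \frac{d\mathbf{r}}{dt} = \mathbf{v}_0 = \text{a constant vector} $$

$$ \mathbf{a}(t) = 0 $$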


Our second example dealt with a projectile fired from the origin
of coordinates ($\mathbf{r}_0 = 0$) with an initial velocity
$\mathbf{v}_0 = (v_0 \cos\alpha)\,\mathbf{i} + (v_0 \sin\alpha)\,\mathbf{j}$.

$$ \mathbf{r}(t) = \mathbf{v}_0 t + \tfrac{1}{2} \mathbf{a}_0 t^2 $$

$$ \mathbf{v} = \frac{d\mathbf{r}}{dt} = \mathbf{v}_0 + \mathbf{a}_0 t $$

where

$$ \mathbf{a}_0 = -g\,\mathbf{j} $$

[g = 32.2 ft/sec^2, or g = 980 cm/sec^2]

The instantaneous acceleration is

$$ \mathbf{a}(t) = \frac{d\mathbf{v}}{dt} = \mathbf{a}_0 = -g\,\mathbf{j} $$

and is a constant vector, directed downward.

We note that $\mathbf{v}(t)$ is tangent to the orbit y = y(x).

The orbit equation can be obtained by eliminating the parameter t
in x(t) and y(t).

Since

$$ \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} = (v_0 \cos\alpha)\,t\,\mathbf{i} + \{(v_0 \sin\alpha)\,t - \tfrac{1}{2} g t^2\}\,\mathbf{j} $$

Equating coefficients of i and of j we obtain

$$ x(t) = (v_0 \cos\alpha)\,t $$

$$ y(t) = (v_0 \sin\alpha)\,t - \tfrac{1}{2} g t^2 $$

By eliminating the parameter t in these two simultaneous algebraic
equations we obtain

$$ y = (\tan\alpha)\,x - \frac{g}{2} \left( \frac{x}{v_0 \cos\alpha} \right)^2 $$
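
The elimination of t can be checked numerically: for any time t, the point (x(t), y(t)) should lie on the orbit y(x). The sketch below is a check under assumed values. The launch speed, the angle, and the function names are illustrative, and g = 980 cm/sec^2 is the value quoted in the text.

```python
import math

G = 980.0  # cm/sec^2


def x_of_t(t, v0, alpha):
    """x(t) = (v0 cos alpha) t."""
    return v0 * math.cos(alpha) * t


def y_of_t(t, v0, alpha, g=G):
    """y(t) = (v0 sin alpha) t - (1/2) g t^2."""
    return v0 * math.sin(alpha) * t - 0.5 * g * t**2


def orbit(x, v0, alpha, g=G):
    """Orbit equation y(x) obtained by eliminating the parameter t."""
    return math.tan(alpha) * x - 0.5 * g * (x / (v0 * math.cos(alpha))) ** 2


v0 = 2000.0                  # cm/sec, hypothetical launch speed
alpha = math.radians(40.0)   # hypothetical launch angle
for t in (0.0, 0.5, 1.0, 2.0):
    x = x_of_t(t, v0, alpha)
    print(f"t = {t:.1f}  y(t) = {y_of_t(t, v0, alpha):.4f}  y(x) = {orbit(x, v0, alpha):.4f}")
```

The two columns should agree to within rounding error, since substituting t = x / (v_0 cos α) into y(t) gives the orbit equation.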

The maximum height h to which the projectile rises is obtained
by finding the time $t_h$ at which the y component of v(t) vanishes

$$ v_y = v_0 \sin\alpha - gt $$

$$ 0 = v_0 \sin\alpha - g t_h $$

giving

$$ t_h = \frac{v_0 \sin\alpha}{g} $$

Then

$$ h = y(t_h) = \frac{1}{2} \frac{(v_0 \sin\alpha)^2}{g} $$
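
As a quick check of the maximum-height result, the sketch below evaluates y at the time when the vertical velocity vanishes and compares it with the closed form, then confirms that nearby times give a lower height. The numerical values and function names are assumptions made for illustration.

```python
import math

G = 980.0  # cm/sec^2, the value of g quoted in the text


def y_of_t(t, v0, alpha, g=G):
    """y(t) = (v0 sin alpha) t - (1/2) g t^2."""
    return v0 * math.sin(alpha) * t - 0.5 * g * t**2


def time_of_max_height(v0, alpha, g=G):
    """Time at which the y component of the velocity vanishes."""
    return v0 * math.sin(alpha) / g


def max_height(v0, alpha, g=G):
    """Closed form h = (1/2) (v0 sin alpha)^2 / g."""
    return 0.5 * (v0 * math.sin(alpha)) ** 2 / g


v0 = 2000.0                  # cm/sec, hypothetical launch speed
alpha = math.radians(60.0)   # hypothetical launch angle
t_h = time_of_max_height(v0, alpha)
print(f"t_h = {t_h:.4f} sec")
print(f"y(t_h) = {y_of_t(t_h, v0, alpha):.4f}  h = {max_height(v0, alpha):.4f}")
print(y_of_t(t_h - 0.01, v0, alpha) < max_height(v0, alpha))  # lower just before
print(y_of_t(t_h + 0.01, v0, alpha) < max_height(v0, alpha))  # lower just after
```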


The maximum distance D traveled in the x direction is of course
obtained by finding the time $t_D$ at which y is zero.

Since

$$ y(t) = (v_0 \sin\alpha)\,t - \tfrac{1}{2} g t^2 $$

$$ 0 = \{ v_0 \sin\alpha - \tfrac{1}{2} g t_D \}\, t_D $$

One solution is t = 0 since the particle starts at y = 0. The
other solution is

$$ t_D = \frac{2 v_0 \sin\alpha}{g} $$

Then

$$ x(t_D) = D = \frac{2 v_0^2 \sin\alpha \cos\alpha}{g} $$
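
The range formula can be verified in the same spirit. The sketch below confirms that the projectile is back at y = 0 at the time of flight and that x at that time equals D. As a side illustration not taken from the text, it also scans whole-degree launch angles for the largest range. All numerical values and function names are assumptions.

```python
import math

G = 980.0  # cm/sec^2, the value of g quoted in the text


def x_of_t(t, v0, alpha):
    """x(t) = (v0 cos alpha) t."""
    return v0 * math.cos(alpha) * t


def y_of_t(t, v0, alpha, g=G):
    """y(t) = (v0 sin alpha) t - (1/2) g t^2."""
    return v0 * math.sin(alpha) * t - 0.5 * g * t**2


def time_of_flight(v0, alpha, g=G):
    """Nonzero root t_D of y(t) = 0."""
    return 2.0 * v0 * math.sin(alpha) / g


def horizontal_range(v0, alpha, g=G):
    """D = 2 v0^2 sin(alpha) cos(alpha) / g."""
    return 2.0 * v0**2 * math.sin(alpha) * math.cos(alpha) / g


v0 = 2000.0                  # cm/sec, hypothetical launch speed
alpha = math.radians(30.0)   # hypothetical launch angle
t_d = time_of_flight(v0, alpha)
print(x_of_t(t_d, v0, alpha), horizontal_range(v0, alpha))  # the two should agree
print(y_of_t(t_d, v0, alpha))                               # close to zero

# Scan whole-degree angles for the one giving the largest range.
best_angle = max(range(1, 90), key=lambda deg: horizontal_range(v0, math.radians(deg)))
print(best_angle)
```

Because 2 sin α cos α = sin 2α, the range is largest when the launch angle is 45 degrees, which is the angle the scan should report.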




Index 


A

Acceleration — 177, 262, 264, 266, 
267 

Addition of vectors — 66, 67, 91, 124
Analytic geometry — 8, 101
Angle between two lines — 108 
Angular momentum — 77, 79, 81 
Angular velocity — 79, 81 
Anti-derivative — 208 
Area — 182, 189, 203, 204, 233, 234, 
235, 242, 243, 248, 249 
Area under the slope curve — 195 
Associative law — 32, 68 
Asymptotes — 119
Average acceleration — 263 
Average velocity — 261 
Axes — 8 

Axioms of E 3 — 2, 5 

B 

Base vectors — 84, 86, 96, 265

C 

Carousel problem — 21, 26 
Cartesian — 8 
Cartesian bases — 90 
Cartesian components — 90 
Centrifugal force — 22 
Chain rule — 221, 224, 245
Characteristic equation — 141, 144, 
148 

Circle — 117 


Closed interval — 167 
Cofactors — 46 
Column matrix — 34, 96 
Commutative law — 32, 67 
Computer — 185 
Conic sections — 102, 123
Conjugate hyperbola — 120 
Continuity — 185, 186, 187, 214 
Coriolis acceleration — 22 
Coordinates — 6 

Coordinate transformations — 24 
Coupled oscillators — 156 
Cross product — 74 
Curvature — 181, 182, 231 
Curves — 114 

Curvilinear coordinates — 16, 21 
Cyclic order — 12 
Cylindrical coordinates — 17, 19 

D 

Definite integral — 240, 245 
Derivative — 203, 207, 217, 218, 230 
Determinant form, triple scalar product — 96
Determinant form, vector product — 95
Determinant of a matrix — 45, 46
Diagonal of a quadratic — 134, 136
Diagonalization — 136, 142, 144
Difference of two vectors — 68
Differential calculus — 172, 203
Differential equations — 252, 253 
Direction cosines — 90 
Discontinuity — 214 


Discriminant — 130, 136 
Distance of a point to a plane — 113 
Distributive law — 32, 68 
Dot product — 71, 88, 92, 98, 106 
Double valued functions — 169 

E 

Eccentricity — 117, 119 
Eigenvalue problem — 147, 260 
Eigenvalues — 134, 141, 143, 148 
Eigenvectors — 144, 260 
Einstein — 59 

Einstein summation convention — 28, 42, 90
Electron — 61
Ellipse — 102, 115, 130, 140, 145
Equal vectors — 65 
Euclidean coordinates — 9 
Even function — 257 
Even series — 256 
Events — 50 

Expansion theorem — 84 
Exponential function — 254, 255 

F 

Foci — 116

G 

Gauss — 5 

Golden rules of differentiation — 220 
Golden rules of integration — 244 
Graphs — 166 

H 

Half life — 61 

Hyperbola — 102, 118, 130, 140, 146

I 

Identity matrix — 33, 122 
Indefinite integral — 241 


Indices — 41 
Inflection — 231 
Initial conditions — 259 
Inner product — 71, 98, 122 
Integral — 203, 207, 233, 239, 240,
241, 242, 246

Integration by parts — 244 
Intersection of two planes — 110 
Interval of definition — 167 
Invariance — 48, 49, 54 
Inverse derivative matrix — 184 
Inverse function rule — 221 
Inverse matrix — 42, 43, 46 
Inversion — 12 

K 

Kinematics — 260 
Kronecker delta — 33, 34, 44 

L 

Lee, T. D. — 15 
Left-handed coordinates — 11 
Length element — 18, 20 
Length of a vector — 64 
Limit — 207, 209, 211, 213, 216, 234, 
238 

Limit from left — 212, 218 
Limit from right — 212, 218 
Limits, properties of — 215 
Linear differential equations — 253 
Linear independence — 83, 84, 86 
Linear momentum — 78 
Linear transformation — 26 
Local maximum — 227 
Local minimum — 228 
Loci — 102 
Locus — 115

Logarithmic integral — 255 
Lorentz transformation — 55, 59 

M 

Magnitude of a matrix — 33, 130, 132 


Magnitude of a vector — 64, 71, 72 
Mathematical objects — 2 
Matrices — 28, 29, 97 
Maxima — 231 
Mean value — 206, 237 
Mean value theorem — 228, 229 
Minima — 231, 232 
Multiple integrals — 246 
Multiplication of matrices — 38, 39 
Multiplication of vectors — 71 
Multiplicity — 169 
Mu meson — 61 

N 

Negative of a vector — 65 
Neutrino — 61 

Non-orthogonal coordinates — 9 
Normal form, ellipse — 116, 135 
Null vector — 64 

O 

Odd function — 257, 258 
Odd series — 257 
Off diagonal terms — 129 
Open interval — 167 
Orthogonal base vectors — 87, 92 
Orthogonal coordinates — 7 
Orthogonal eigenvectors — 150, 155 
Orthogonal matrices — 47, 48 
Orthogonal transformations — 47, 48, 
50, 56 

P 

Parabola — 102, 120, 130 
Parallelogram method — 67 
Parametric equations — 105, 109 
Parity — 13 

Parity experiments — 15 
Plane — 110, 112
Points of inflection — 232 
Polar coordinates — 16, 17, 18 
Polynomial — 222, 223 


Position vector — 98 
Principle of relativity — 52, 53 
Product rule — 221 
Products, matrix — 39, 40 
Projection of one vector on another — 
71, 85, 87 

Projection of a line on a plane — 108 
Pythagorean Theorem — 6 

Q 

Quadratic form — 121, 122, 128 
Quotient rule — 221, 223, 245 

R 

Rectangular partitioning — 190, 233 
Relative motion — 53 
Repeated product — 142 
Resultant matrix — 39 
Resultant vector — 91 
Riemann sum — 239 
Right hand convention — 89, 94 
Right handed coordinates — 10, 11,
76, 89
Rolle's theorem — 229
Rotations — 24, 97, 99 
Rotations of matrices — 131 
Rotations of vectors — 100 

S

Scalar — 65, 97 
Scalar product — 71, 88, 92
Second derivative — 230 
Second order tensors — 97 
Secular determinant — 141 
Semi-major axis — 117, 129 
Semi-minor axis — 117, 129 
Sequence — 210 
Similarity transformation — 133 
Single valued functions — 169 
Slope of a curve — 172, 208, 217, 231 
Slope of a straight line — 108 
Slope matrix — 178, 180 


Space 

Euclidean — 2 
Local — 2 

Three dimensional — 2, 104
Two dimensional — 2, 104 
Space-time — 50 
Special relativity — 50, 55, 59 
Speed of light — 52, 53 
Spherical coordinates — 17-20 
Step function — 186 
Straight lines — 102, 104, 110 
Subspace — 83 

Subtraction of two vectors — 68, 92
Symmetry point — 123 

T 

Tangent line — 217 
Tangential velocity — 79, 80, 81 
Tensors — 97 
Time dilation — 60 
Trace — 132, 133 
Translation — 24, 123 
Transpose Matrix — 33, 47, 48 
Trapezoidal partitioning — 192, 193, 
241 


Triple scalar product — 81, 82 
Triple valued functions — 169 
Triple vector product — 83, 85 
Triplet of numbers — 6, 7, 8, 12 

U

Unit base vectors — 89 
Unit matrix — 33, 34, 44 
Unit vector — 66, 105 

V 

Vector — 35, 63, 97 
Vector product — 74, 94 
Vector transformations — 96 
Velocity — 69, 78, 177, 261, 264, 265 
Volume — 250, 251 

W 

Work — 73

Y 

Yang, F. — 15 

